{"id":15067604,"url":"https://github.com/szabgab/rust-digger","last_synced_at":"2025-04-10T14:22:46.936Z","repository":{"id":177000296,"uuid":"656001370","full_name":"szabgab/rust-digger","owner":"szabgab","description":null,"archived":false,"fork":false,"pushed_at":"2024-05-01T17:08:06.000Z","size":2874,"stargazers_count":16,"open_issues_count":42,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-05-01T23:27:25.613Z","etag":null,"topics":["git","github","rust","rust-lang","rustlang"],"latest_commit_sha":null,"homepage":"https://rust-digger.code-maven.com/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/szabgab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"szabgab"}},"created_at":"2023-06-20T04:20:10.000Z","updated_at":"2024-05-03T12:59:11.301Z","dependencies_parsed_at":"2023-09-26T18:33:23.237Z","dependency_job_id":"3572bb3f-4060-41cf-a819-fae104a26da2","html_url":"https://github.com/szabgab/rust-digger","commit_stats":null,"previous_names":["szabgab/rust-digger"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szabgab%2Frust-digger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szabgab%2Frust-digger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szabgab%2Frust-digger/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szabgab%2Frust-digger/manifests","owner_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/owners/szabgab","download_url":"https://codeload.github.com/szabgab/rust-digger/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248233936,"owners_count":21069493,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["git","github","rust","rust-lang","rustlang"],"created_at":"2024-09-25T01:25:10.637Z","updated_at":"2025-04-10T14:22:46.920Z","avatar_url":"https://github.com/szabgab.png","language":"Rust","funding_links":["https://github.com/sponsors/szabgab"],"categories":[],"sub_categories":[],"readme":"# [Rust Digger](https://rust-digger.code-maven.com/)\n\n* Analyze Rust Crates, help evaluation and suggest improvements\n\n* Fetch list of [Crates](https://crates.io/)\n* Process the data\n* Generate static HTML pages\n\n## Contribution\n\nPlease send small pull-requests and make sure each PR changes one thing.\n\nIf you would like to implement a feature, but first you need to refactor the code, please send the PR to refactor the code\nand only once I accepted that send the change to implement the feature. This might sound frustrating, but I am\nnot very good at code reviews so if I get a long PR that changes several things that don't have to be changed at once,\nthen I might not understand it and I might not accept it. 
That would be a lot more frustrating to both of us.\n\nPlease either set up the `pre-commit hooks` as described below or run `cargo fmt`, `cargo clippy`, and `cargo test`\nmanually before committing code.\n\n## Local development environment\n\n```\ngit clone https://github.com/szabgab/rust-digger.git\ncd rust-digger\n```\n\nOptionally install [pre-commit](https://pre-commit.com/) and then run `pre-commit install` to configure it on this project.\n\nDownload the data from static.crates.io:\n\n```\ncargo run --bin rust-digger-download-db-dump\n```\n\nClone 15 repositories of the crates that were released in the last 10 days:\n\n```\ncargo run --bin rust-digger-clone -- --recent 10 --limit 15\n```\n\nCollect data from 10 of the repositories (VCSs) we cloned. (You can use any number there.)\n\n```\ncargo run --bin rust-digger-vcs -- --limit 10\n```\n\nDownload some of the released crates from Crates.io:\n\n```\ncargo run --bin rust-digger-download-crates -- --limit 10\n```\n\nGenerate the static HTML pages for 10 crates:\n\n```\ncargo run --bin rust-digger-html -- --limit 10\n```\n\nTo run a local web server to serve the static files, install [rustatic](https://github.com/szabgab/rustatic) using:\n\n```\ncargo install rustatic\n```\n\n\nand then run:\n\n```\nrustatic --nice --indexfile index.html --path _site/\n```\n\n\n## Deployment on Ubuntu-based server\n\nBased on https://www.rust-lang.org/tools/install\n```\nsudo apt install pkg-config\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\ncargo build --release\n```\n\nThere is a cron-job that runs `all.sh` once a day. (As long as we use the dumped data from Crates.io, there is no point in running more frequently.)\n\n## Processing steps\n\n### Fetching data from crates.io\n\nDiscussed here: https://crates.io/data-access\n\nAs of 2024.03.26\n\n1. The git repository https://github.com/rust-lang/crates.io-index does not contain the metadata, such as the GitHub URL.\n1. 
The https://static.crates.io/db-dump.tar.gz is 305 MB. It unzips to a timestamped folder called `YYYY-MM-DD-020046`, which is 1.1 GB and contains CSV dumps of a PostgreSQL database.\n\nThe fetching and unzipping is done by the `rust-digger-download` binary.\n\nFor each crate (or for each new crate if we start working incrementally) check if it refers to a repo.\nFor each repo, maintain a file called `repo-details/github/repo-name.json` in this repo, in which we keep all the information we collected about the repository. When generating the HTML files we consult these files. These files are also updated by the stand-alone processes listed below.\nThe files are mapped with the Details struct.\n\n\n### Cloning repositories\n\n* `git pull` takes 0.3 sec when it does not need to copy any files.\n* There are 123,216 crates.\n* Assuming all of them will have git repositories and most of them won't change, we'll need\n  123,000 * 0.3 = 36,900 sec to update all the repos, that is 615 minutes, or about 10.25 hours.\n\n\nIf we fail to clone the repository we add this information to the repo-details file of the repository.\n\n### Analyzing repositories\n\n* Some information is easy and fast to collect (e.g. checking if there are YAML files in `.github/workflows` to check if GitHub Actions is configured).\n\n\n* TODO: if there is more than one crate in the repo, should we analyze and report the crates separately?\n\n### Docker\n\ndocker build -t rust-test .\ndocker run --rm -it -v$(pwd):/crate --workdir /crate  --user tester rust-test\n\n### cargo fmt\n\n* Running `cargo fmt --check -- --color=never` and capturing the STDOUT and the exit code. We save them together with the current SHA of the repository (`git rev-parse HEAD`) and the date of processing. 
(We might also want to save the version of rustfmt `cargo fmt --version` and the version of rustc `rustc --version`)\n\n```\ncargo run --bin fmt -- --limit 10\n```\n\n### cargo fix\n\n\n### cargo test\n\n\n### Collect test coverage report\n\n\n```\nrustup toolchain install nightly\nrustup default nightly\n\ncargo install rustfilt\ncargo clean\n\nRUSTFLAGS=\"-C instrument-coverage\" cargo build\nRUSTFLAGS=\"-C instrument-coverage\" cargo test --tests\nllvm-profdata merge -sparse *.profraw -o x.profdata\n\n\ncargo install cargo-tarpaulin\ncargo tarpaulin --workspace --out html --out json\n```\n\n\n## Related Sites\n\n* https://crates.io/\n* https://docs.rs/\n* https://lib.rs/\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fszabgab%2Frust-digger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fszabgab%2Frust-digger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fszabgab%2Frust-digger/lists"}