{"id":13930194,"url":"https://github.com/Canop/backdown","last_synced_at":"2025-07-19T12:31:54.222Z","repository":{"id":50658504,"uuid":"289356392","full_name":"Canop/backdown","owner":"Canop","description":"A deduplicator","archived":false,"fork":false,"pushed_at":"2024-09-10T17:25:33.000Z","size":554,"stargazers_count":121,"open_issues_count":2,"forks_count":7,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-13T09:59:08.111Z","etag":null,"topics":["duplicates","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Canop.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"github":["Canop"]}},"created_at":"2020-08-21T20:11:49.000Z","updated_at":"2024-10-23T19:00:25.000Z","dependencies_parsed_at":"2024-01-07T21:02:15.652Z","dependency_job_id":"2bd9351c-4d3b-4259-94a0-0187c8108cd7","html_url":"https://github.com/Canop/backdown","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Canop%2Fbackdown","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Canop%2Fbackdown/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Canop%2Fbackdown/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Canop%2Fbackdown/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Canop","download_url":"https://codeload.github.com/Canop/backdown/tar.gz/refs/heads/master"
,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226607604,"owners_count":17658483,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["duplicates","rust"],"created_at":"2024-08-07T18:05:05.713Z","updated_at":"2024-11-26T19:30:55.583Z","avatar_url":"https://github.com/Canop.png","language":"Rust","funding_links":["https://github.com/sponsors/Canop"],"categories":["Rust","rust","\u003ca name=\"file-dir-cleanup\"\u003e\u003c/a\u003eClean up of files and directories"],"sub_categories":[],"readme":"# backdown\n\n[![MIT][s2]][l2] [![Latest Version][s1]][l1] [![Build][s3]][l3] [![Chat on Miaou][s4]][l4]\n\n[s1]: https://img.shields.io/crates/v/backdown.svg\n[l1]: https://crates.io/crates/backdown\n\n[s2]: https://img.shields.io/badge/license-MIT-blue.svg\n[l2]: LICENSE\n\n[s3]: https://github.com/Canop/backdown/actions/workflows/rust.yml/badge.svg\n[l3]: https://github.com/Canop/backdown/actions/workflows/rust.yml\n\n[s4]: https://miaou.dystroy.org/static/shields/room.svg\n[l4]: https://miaou.dystroy.org/3768?Rust\n\n**Backdown** helps you safely and ergonomically remove duplicate files.\n\nIts design is based upon my observation of frequent patterns regarding build-up of duplicates with time, especially images and other media files.\n\nFinding duplicates is easy. Cleaning the disk when there are thousands of them is the hard part. What Backdown brings is the easy way to select and remove the duplicates you don't want to keep.\n\nA Backdown session goes through the following phases:\n\n1. 
Backdown analyzes the directory of your choice and finds sets of duplicates (files whose content is exactly the same). Backdown ignores symlinks and files or directories whose name starts with a dot.\n2. Backdown asks you a few questions depending on the analysis. Nothing is removed at this point: you only stage files for removal. Backdown never lets you stage all items in a set of identical files\n3. After optionally reviewing the list of staged files, you confirm the removals\n4. Backdown does the removals on disk\n\n# What it looks like\n\nAnalysis and first question:\n\n![screen 1](doc/screen-1.png)\n\nAnother kind of question:\n\n![screen 2](doc/screen-2.png)\n\nYet another one:\n\n![screen 3](doc/screen-3.png)\n\nYet another one:\n\n![screen 4](doc/screen-4.png)\n\nReview and Confirm:\n\n![screen 5](doc/screen-5.png)\n\nAt this point you may also export the report as JSON, and you may decide to replace each removed file with a link to one of the kept ones.\n\n# Installation\n\n## From the crates.io repository\n\nYou must have the Rust toolchain installed: https://rustup.rs\n\nRun\n\n```bash\ncargo install --locked backdown\n```\n\n## From Source\n\nYou must have the Rust toolchain installed: https://rustup.rs\n\nDownload this repository, then run\n\n```bash\ncargo install --path .\n```\n\n## Precompiled binaries\n\nUnless you're a Rust developer, I recommend you just download the precompiled binaries, as this will save a lot of space on your disk.\n\nBinaries are made available at https://dystroy.org/backdown/download/\n\n# Usage\n\n## Deduplicate any kind of file\n\n```bash\nbackdown /some/directory\n```\n\n## Deduplicate images\n\n```bash\nbackdown -i /some/directory\n```\n\n## JSON report\n\nAfter the staging phase, you may decide to export a report as JSON. 
Exporting the report doesn't prevent you from also doing the removals.\n\nThe JSON looks like this:\n\n```JSON\n{\n  \"dup_sets\": [\n    {\n      \"file_len\": 1212746,\n      \"files\": {\n        \"trav-copy/2006-05 (mai)/HPIM0530.JPG\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0530 (another copy).JPG\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0530 (copy).JPG\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0530.JPG\": \"keep\"\n      }\n    },\n    {\n      \"file_len\": 1980628,\n      \"files\": {\n        \"trav-copy/2006-03 (mars)/HPIM0608.JPG\": \"keep\",\n        \"trav-copy/2006-05 (mai)/HPIM0608.JPG\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0608.JPG\": \"keep\"\n      }\n    },\n    {\n      \"file_len\": 1124764,\n      \"files\": {\n        \"trav-copy/2006-05 (mai)/HPIM0529.JPG\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0529.JPG\": \"keep\"\n      }\n    },\n    {\n      \"file_len\": 1706672,\n      \"files\": {\n        \"trav-copy/2006-05 (mai)/test.jpg\": \"remove\",\n        \"trav-copy/2006-06 (juin)/HPIM0598.JPG\": \"keep\"\n      }\n    }\n  ],\n  \"len_to_remove\": 8450302\n}\n```\n\n# Advice\n\n* If you launch backdown on a big directory, it may find more duplicates than you suspect there are. Don't force yourself to answer *all* questions at first: if you stage the removals of the first dozen questions, you'll already gain a lot, and you may do the other ones another day\n* Don't launch backdown at the root of your disk because you don't want to try and deal with duplicates in system resources, programs, build artefacts, etc. Launch backdown where you store your images, videos, or music\n* Backdown isn't designed for dev directories and doesn't respect .gitignore rules\n* If you launch backdown in a directory with millions of files on a slow disk, you'll have to wait a long time while the content is hashed. 
Try with a smaller directory first if you have an HDD\n* If you're only interested in images, use the -i option\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCanop%2Fbackdown","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FCanop%2Fbackdown","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FCanop%2Fbackdown/lists"}