{"id":15405030,"url":"https://github.com/razum2um/xxhashdir_comm","last_synced_at":"2025-10-27T15:40:48.468Z","repository":{"id":139074528,"uuid":"411511125","full_name":"razum2um/xxhashdir_comm","owner":"razum2um","description":"🏭 identifies common or duplicates across different hosts","archived":false,"fork":false,"pushed_at":"2021-09-29T03:33:57.000Z","size":8,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-01T19:13:37.601Z","etag":null,"topics":["difference-detection","duplicate-detection","xxhash","xxhashdir"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/razum2um.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-29T03:01:59.000Z","updated_at":"2023-12-17T22:39:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"6dcd487a-d0b4-437d-915c-022b028876b5","html_url":"https://github.com/razum2um/xxhashdir_comm","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/razum2um%2Fxxhashdir_comm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/razum2um%2Fxxhashdir_comm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/razum2um%2Fxxhashdir_comm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/razum2um%2Fxxhashdir_comm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/razum2um","download_url":"https://codeload.github.com/razum2um/xxhashdir_comm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245868327,"owners_count":20685609,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["difference-detection","duplicate-detection","xxhash","xxhashdir"],"created_at":"2024-10-01T16:14:50.016Z","updated_at":"2025-10-27T15:40:43.427Z","avatar_url":"https://github.com/razum2um.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# xxhashdir_comm\n\n[![CI](https://github.com/razum2um/xxhashdir_comm/actions/workflows/rust.yml/badge.svg)](https://github.com/razum2um/xxhashdir_comm/actions/workflows/rust.yml)\n\n## Problem\n\n- Sometimes you made a backup by \"hey, let's just rsync it somewhere\"\n- Now you struggle to merge such \"backups\" to save space across hosts and drives?\n\nThis helps to identify common or duplicate files across different hosts\nusing collected [xxhashdir](https://github.com/lunatic-cat/xxhashdir) results (plaintext in the format `\\d{0,20}  .*`, where the first column is the `xxhash` checksum)\n\n## Howto\n\n### Prepare files with checksums\n\n```sh\n# on remote host\nxxhashdir . \u003e remote.xxhashdir\n# on local host\nxxhashdir . 
\u003e local.xxhashdir\nscp remote:remote.xxhashdir remote.xxhashdir\n```\n\n### Usage\n\n```sh\n# 🚀 to get common files (sources are mostly different)\n# you likely want to know this to delete duplicates first, then copy the rest\nxxhashdir_comm --common local.xxhashdir remote.xxhashdir\n\n# 🚀 to get different files (sources are mostly equal)\n# you likely want to know this to merge unique files from the second into the first, then delete the second entirely\nxxhashdir_comm --only-second local.xxhashdir remote.xxhashdir\n```\n\n## Why not _\n\n- `rsync` can delete files on the receiver, but relies only on filenames and mtime\n- `fdupes` works only locally\n- `zfs snapshot` + `zfs diff` is perfect, but it is also local only and requires both sides to start from a common dataset\n- incremental backups: you don't always bother to set them up\n\n## Why use it\n\n- stdout can be reprocessed with sed/grep/whatever again\n- Unix way\n- having fun with Rust\n\n## Further plans\n\n- Unify output with the standard `comm` utility (columns \u0026 accept `-1/2/3`)\n- Consider `xxhashdir` with bytesize input and compare bytesizes\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frazum2um%2Fxxhashdir_comm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frazum2um%2Fxxhashdir_comm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frazum2um%2Fxxhashdir_comm/lists"}