{"id":18030842,"url":"https://github.com/atc0005/bridge","last_synced_at":"2025-09-18T00:32:32.243Z","repository":{"id":38842976,"uuid":"230858435","full_name":"atc0005/bridge","owner":"atc0005","description":"A small CLI utility used to find duplicate files","archived":false,"fork":false,"pushed_at":"2024-12-19T08:22:43.000Z","size":4012,"stargazers_count":6,"open_issues_count":18,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-31T19:07:45.180Z","etag":null,"topics":["duplicate","file","go"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/atc0005.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-12-30T06:10:32.000Z","updated_at":"2024-12-04T16:00:52.000Z","dependencies_parsed_at":"2023-12-20T14:59:25.046Z","dependency_job_id":"8262cf74-f6ca-4671-8d08-6092fe226bfc","html_url":"https://github.com/atc0005/bridge","commit_stats":null,"previous_names":[],"tags_count":56,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atc0005%2Fbridge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atc0005%2Fbridge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atc0005%2Fbridge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atc0005%2Fbridge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/atc0005","download_url":"https://codeload.github.com/atc0005/bri
dge/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233433642,"owners_count":18675601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["duplicate","file","go"],"created_at":"2024-10-30T09:15:19.825Z","updated_at":"2025-09-18T00:32:31.468Z","avatar_url":"https://github.com/atc0005.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- omit in toc --\u003e\n# Bridge\n\nA small CLI utility used to find duplicate files.\n\n[![Latest Release](https://img.shields.io/github/release/atc0005/bridge.svg?style=flat-square)][repo-url]\n[![Go Reference](https://pkg.go.dev/badge/github.com/atc0005/bridge.svg)](https://pkg.go.dev/github.com/atc0005/bridge)\n[![go.mod Go version](https://img.shields.io/github/go-mod/go-version/atc0005/bridge)](https://github.com/atc0005/bridge)\n[![Lint and Build](https://github.com/atc0005/bridge/actions/workflows/lint-and-build.yml/badge.svg)](https://github.com/atc0005/bridge/actions/workflows/lint-and-build.yml)\n[![Project Analysis](https://github.com/atc0005/bridge/actions/workflows/project-analysis.yml/badge.svg)](https://github.com/atc0005/bridge/actions/workflows/project-analysis.yml)\n\n\u003c!-- omit in toc --\u003e\n## Table of Contents\n\n- [Project home](#project-home)\n- [Overview](#overview)\n  - [Generate report](#generate-report)\n  - [Prune duplicate files](#prune-duplicate-files)\n- [Features](#features)\n- [Changelog](#changelog)\n- [Requirements](#requirements)\n  - [Building source code](#building-source-code)\n  - 
[Running](#running)\n- [Installation](#installation)\n  - [From source](#from-source)\n  - [Using release binaries](#using-release-binaries)\n- [Configuration Options](#configuration-options)\n  - [Command-line Arguments](#command-line-arguments)\n    - [`report` subcommand](#report-subcommand)\n    - [`prune` subcommand](#prune-subcommand)\n- [Examples](#examples)\n  - [Generating a report](#generating-a-report)\n    - [Single path, recursive](#single-path-recursive)\n    - [Multiple paths, non-recursive](#multiple-paths-non-recursive)\n    - [Invalid flag](#invalid-flag)\n  - [Pruning duplicate files](#pruning-duplicate-files)\n    - [Dry-run (minimal)](#dry-run-minimal)\n    - [Dry-run (verbose)](#dry-run-verbose)\n    - [Backup files before removing them](#backup-files-before-removing-them)\n- [License](#license)\n  - [Core project files](#core-project-files)\n  - [`ByteCountSI`, `ByteCountIEC` functions](#bytecountsi-bytecountiec-functions)\n- [References](#references)\n\n## Project home\n\nSee [our GitHub repo][repo-url] for the latest code, to file an issue or\nsubmit improvements for review and potential inclusion into the project.\n\n## Overview\n\n1. Generate report\n   - Find duplicate files and report them via console-only output or an output\n     CSV file\n1. 
Remove flagged files\n   - Process CSV file report generated earlier: if flag is set,\n     (optionally) back up and then remove marked files\n\n### Generate report\n\nGenerating a report is the first step towards indicating which files from a\nduplicate file set you wish to remove (specified explicitly) and which you\nwish to keep (default behavior).\n\n### Prune duplicate files\n\nPruning duplicate files is an optional second step following the generation of\na duplicate files report (via the `report` subcommand).\n\nYou first open the CSV file using an application like Microsoft Excel or\nLibreOffice Calc and then set the `remove_file` column to `true` for each file\nthat you wish to remove; the default is `false`, so marking an entry with\n`false` is not strictly necessary.\n\nOnce marked, you are then able to remove those files by specifying the full\npath to the CSV file (via the `prune` subcommand). See the\n[Examples](#examples) section for details.\n\n## Features\n\n- Efficient evaluation of potential duplicates by limiting checksum generation\n  to two or more identically sized files\n- Support for creating CSV report of all duplicate file matches\n- Support for generating (rough) console equivalent of CSV file for\n  (potential) quick review\n- Support for creating Microsoft Excel workbook of all duplicate file matches\n- Support for evaluating one or many paths\n- Recursive or shallow directory evaluation\n- Optional removal of (user-flagged) duplicate files from a previously\n  generated CSV report\n- Go modules (vs classic `GOPATH` setup)\n\n## Changelog\n\nSee the [`CHANGELOG.md`](CHANGELOG.md) file for the changes associated with\neach release of this application. Changes that have been merged to `master`\nbut are not yet part of an official release may also be noted in the file\nunder the `Unreleased` section. 
A helpful link to the Git commit history since the last\nofficial release is also provided for further review.\n\n## Requirements\n\nThe following is a loose guideline. Other combinations of Go and operating\nsystems for building and running tools from this repo may work, but have not\nbeen tested.\n\n### Building source code\n\n- Go\n  - see this project's `go.mod` file for *preferred* version\n  - this project tests against [officially supported Go\n    releases][go-supported-releases]\n    - the most recent stable release (aka, \"stable\")\n    - the prior, but still supported release (aka, \"oldstable\")\n- GCC\n  - if building with custom options (as the provided `Makefile` does)\n- `make`\n  - if using the provided `Makefile`\n\n### Running\n\n- Windows 10\n- Ubuntu Linux 18.04+\n\n## Installation\n\n### From source\n\n1. [Download][go-docs-download] Go\n1. [Install][go-docs-install] Go\n1. Clone the repo\n   1. `cd /tmp`\n   1. `git clone https://github.com/atc0005/bridge`\n   1. `cd bridge`\n1. Install dependencies (optional)\n   - for Ubuntu Linux\n     - `sudo apt-get install make gcc`\n   - for CentOS Linux\n     - `sudo yum install make gcc`\n1. Build\n   - for current operating system\n     - `go build -mod=vendor ./cmd/bridge/`\n       - *forces build to use bundled dependencies in top-level `vendor`\n         folder*\n   - for all supported platforms (where `make` is installed)\n     - `make all`\n   - for Windows\n     - `make windows`\n   - for Linux\n     - `make linux`\n1. Copy the applicable binary to whichever systems need to run it\n   - if using `Makefile`: look in `/tmp/release_assets/bridge/`\n   - if using `go build`: look in `/tmp/bridge/`\n\n**NOTE**: Depending on which `Makefile` recipe you use, the generated binary\nmay be compressed and have an `xz` extension. If so, decompress the binary\nbefore deploying it (e.g., `xz -d bridge-linux-amd64.xz`).\n\n### Using release binaries\n\n1. 
Download the [latest release][repo-url] binaries\n1. Decompress binaries\n   - e.g., `xz -d bridge-linux-amd64.xz`\n1. Deploy\n   - Place `bridge` in a location of your choice\n     - e.g., `/usr/local/bin/bridge`\n\n**NOTE**:\n\nDEB and RPM packages are provided as an alternative to manually deploying\nbinaries.\n\n## Configuration Options\n\n### Command-line Arguments\n\n#### `report` subcommand\n\n| Option          | Required | Default        | Repeat | Possible                            | Description                                                                                                                                  |\n| --------------- | -------- | -------------- | ------ | ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |\n| `h`, `help`     | No       | `false`        | No     | `h`, `help`                         | Show Help text along with the list of supported flags.                                                                                       |\n| `console`       | No       | `false`        | No     | `true`, `false`                     | Dump (approximate) CSV file equivalent to console.                                                                                           |\n| `csvfile`       | Yes      | *empty string* | No     | *valid file name characters*        | The fully-qualified path to a CSV file that this application should generate.                                                                |\n| `excelfile`     | No       | *empty string* | No     | *valid file name characters*        | The fully-qualified path to a Microsoft Excel file that this application should generate.                                                    |\n| `size`          | No       | `1` (byte)     | No     | `0+`                                | File size limit for evaluation. 
Files smaller than this will be skipped.                                                                     |\n| `duplicates`    | No       | `2`            | No     | `2+`                                | Number of files of the same file size needed before duplicate validation logic is applied.                                                   |\n| `ignore-errors` | No       | `false`        | No     | `true`, `false`                     | Ignore minor errors whenever possible. This option does not affect handling of fatal errors such as failure to generate output report files. |\n| `path`          | Yes      | *empty string* | Yes    | *one or more valid directory paths* | Path to process. This flag may be repeated for each additional path to evaluate.                                                             |\n| `recurse`       | No       | `false`        | No     | `true`, `false`                     | Perform recursive search into subdirectories per provided path.                                                                              |\n\n#### `prune` subcommand\n\n| Option          | Required | Default        | Repeat | Possible                     | Description                                                                                                                                                                     |\n| --------------- | -------- | -------------- | ------ | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `h`, `help`     | No       | `false`        | No     | `h`, `help`                  | Show Help text along with the list of supported flags.                                                                                                                          
|\n| `console`       | No       | `false`        | No     | `true`, `false`              | Dump (approximate) CSV file equivalent to console.                                                                                                                              |\n| `dry-run`       | No       | `false`        | No     | `true`, `false`              | Don't actually remove files. Echo what would have been done to stdout.                                                                                                          |\n| `ignore-errors` | No       | `false`        | No     | `true`, `false`              | Ignore minor errors whenever possible. This option does not affect handling of fatal errors such as failure to generate output report files.                                    |\n| `input-csvfile` | Yes      | *empty string* | No     | *valid file name characters* | The fully-qualified path to a CSV file that this application should use for file removal decisions.                                                                             |\n| `backup-dir`    | No       | *empty string* | No     | *valid directory path*       | The writable directory path where files should be relocated instead of removing them. The original path structure will be created starting with the specified path as the root. |\n| `blank-line`    | No       | `false`        | No     | `true`, `false`              | Add a blank line between sets of matching files in console and file output.                                                                                                     |\n| `use-first-row` | No       | `false`        | No     | `true`, `false`              | Attempt to use the first row of the input file. Normally this row is skipped since it is usually the header row and not duplicate file data.                                    
|\n\n## Examples\n\n### Generating a report\n\n#### Single path, recursive\n\nThis example illustrates using the application to process a single path,\nrecursively.\n\n```ShellSession\n./bridge.exe report -recurse -path \"/tmp/path1\" -csvfile \"path1-report.csv\"\n```\n\n#### Multiple paths, non-recursive\n\nThis example illustrates using the application to process multiple paths,\nwithout recursively evaluating any subdirectories.\n\n```ShellSession\n./bridge.exe report -path \"/tmp/path1\" -path \"/tmp/path2\"  -csvfile \"report.csv\"\n```\n\n#### Invalid flag\n\nAccidentally typing the wrong flag results in a message like this one:\n\n```ShellSession\n$ ./bridge.exe report -fake-flag\nDEBUG: subcommand 'report'\nflag provided but not defined: -fake-flag\n\nbridge x.y.z\nhttps://github.com/atc0005/bridge\n\nUsage of \"bridge report\":\n  -console\n        Dump (approximate) CSV file equivalent to console.\n  -csvfile string\n        The (required) fully-qualified path to a CSV file that this application should generate.\n  -duplicates int\n        Number of files of the same file size needed before duplicate validation logic is applied. (default 2)\n  -excelfile string\n        The (optional) fully-qualified path to an Excel file that this application should generate.\n  -ignore-errors\n        Ignore minor errors whenever possible. This option does not affect handling of fatal errors such as failure to generate output report files.\n  -path value\n        Path to process. This flag may be repeated for each additional path to evaluate.\n  -recurse\n        Perform recursive search into subdirectories per provided path.\n  -size int\n        File size limit (in bytes) for evaluation. Files smaller than this will be skipped. 
(default 1)\nDEBUG: err returned from reportCmd.Parse(): flag provided but not defined: -fake-flag\n\nERROR: flag provided but not defined: -fake-flag\n```\n\n### Pruning duplicate files\n\n#### Dry-run (minimal)\n\n```ShellSession\n./bridge.exe prune -input-csvfile \"report.csv\" -dry-run -ignore-errors\n```\n\nHere we specify:\n\n- Don't actually remove files, just simulate the process\n- input CSV file (file previously generated by the `report` subcommand)\n- ignore (minor) errors\n\nBecause the `console` flag wasn't specified, the output is minimal.\n\n#### Dry-run (verbose)\n\n```ShellSession\n./bridge.exe prune -input-csvfile \"report.csv\" -dry-run -ignore-errors -console\n```\n\nHere we specify:\n\n- Don't actually remove files, just simulate the process\n- input CSV file (file previously generated by the `report` subcommand)\n- ignore (minor) errors\n- `console` flag\n  - enables printing table of parsed CSV contents\n  - enables printing table of file removal candidates\n\nBecause the `console` flag *was* specified, the output is more verbose.\n\n#### Backup files before removing them\n\n```ShellSession\n./bridge.exe prune -input-csvfile \"report.csv\" -backup-dir /tmp/tacos -dry-run -ignore-errors -console\n```\n\nHere we specify:\n\n- the input CSV file (file previously generated by the `report` subcommand)\n- the backup directory that should be used to copy files to (just before a\n  file removal operation is attempted)\n- ignore (minor) errors\n- `console` flag\n  - enables printing table of parsed CSV contents\n  - enables printing table of file removal candidates\n\nBecause the `console` flag *was* specified, the output is more verbose. 
This\ncan make the removal process easier to troubleshoot due to the explicit\nlisting of what *would* be removed and what actually occurred.\n\n## License\n\n### Core project files\n\nFrom the [LICENSE](LICENSE) file:\n\n```license\nMIT License\n\nCopyright (c) 2020 Adam Chalkley\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n```\n\n### `ByteCountSI`, `ByteCountIEC` functions\n\nThese utility functions are provided by **Stefan Nilsson** under the\n**Attribution 3.0 Unported (CC BY 3.0)** license. 
See the **References** section\nof this document for links to additional information.\n\n## References\n\n- \u003chttps://yourbasic.org/golang/formatting-byte-size-to-human-readable-format/\u003e\n  - \u003chttps://yourbasic.org/golang/byte-count.go\u003e\n  - \u003chttps://creativecommons.org/licenses/by/3.0/\u003e\n\n- \u003chttps://stackoverflow.com/questions/28322997/how-to-get-a-list-of-values-into-a-flag-in-golang\u003e\n- \u003chttps://golang.org/pkg/flag/#Value\u003e\n- \u003chttps://gobyexample.com/command-line-subcommands\u003e\n\n- \u003chttps://stackoverflow.com/questions/50324612/merge-maps-in-golang/50325337#50325337\u003e\n- \u003chttps://yourbasic.org/golang/gotcha-change-value-range/\u003e\n\n- \u003chttps://www.digitalocean.com/community/tutorials/understanding-defer-in-go\u003e\n- \u003chttps://golangcode.com/writing-to-file/\u003e\n- \u003chttps://www.joeshaw.org/dont-defer-close-on-writable-files/\u003e\n- \u003chttps://golang.org/pkg/os/#File.Sync\u003e\n- \u003chttps://www.linode.com/docs/development/go/creating-reading-and-writing-files-in-go-a-tutorial/\u003e\n\n- \u003chttps://medium.com/@sebassegros/golang-dealing-with-maligned-structs-9b77bacf4b97\u003e\n\n- \u003chttps://goenning.net/2017/01/25/adding-custom-data-go-binaries-compile-time/\u003e\n  - covers updating variables at build time, particularly sub-packages (GH-55)\n\n- \u003chttps://github.com/360EntSecGroup-Skylar/excelize\u003e\n\n\u003c!-- Footnotes here  --\u003e\n\n[repo-url]: \u003chttps://github.com/atc0005/bridge\u003e  \"This project's GitHub repo\"\n\n[go-docs-download]: \u003chttps://golang.org/dl\u003e  \"Download Go\"\n\n[go-docs-install]: \u003chttps://golang.org/doc/install\u003e  \"Install Go\"\n\n[go-supported-releases]: \u003chttps://go.dev/doc/devel/release#policy\u003e \"Go Release Policy\"\n\n\u003c!-- []: PLACEHOLDER \"DESCRIPTION_HERE\" 
--\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatc0005%2Fbridge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatc0005%2Fbridge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatc0005%2Fbridge/lists"}