{"id":20714031,"url":"https://github.com/tazeg/hscan","last_synced_at":"2025-04-23T08:11:29.265Z","repository":{"id":57595892,"uuid":"298863978","full_name":"Tazeg/hscan","owner":"Tazeg","description":"Scans recursively a path to match given sha1 checksums.","archived":false,"fork":false,"pushed_at":"2020-09-27T19:25:55.000Z","size":14,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-29T23:05:06.125Z","etag":null,"topics":["forensic-analysis","forensics","forensics-investigations","golang","sha1","sha1sum"],"latest_commit_sha":null,"homepage":"https://jeffprod.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Tazeg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-09-26T17:19:55.000Z","updated_at":"2022-02-18T00:29:38.000Z","dependencies_parsed_at":"2022-09-26T20:01:18.784Z","dependency_job_id":null,"html_url":"https://github.com/Tazeg/hscan","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tazeg%2Fhscan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tazeg%2Fhscan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tazeg%2Fhscan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tazeg%2Fhscan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Tazeg","download_url":"https://codeload.github.com/Tazeg/hscan/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","k
ind":"github","repositories_count":250395287,"owners_count":21423400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["forensic-analysis","forensics","forensics-investigations","golang","sha1","sha1sum"],"created_at":"2024-11-17T02:28:45.689Z","updated_at":"2025-04-23T08:11:29.243Z","avatar_url":"https://github.com/Tazeg.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HSCAN\n\nRecursively scans a path for files matching given SHA-1 checksums.\nUseful for finding duplicate files, or for identifying relevant, irrelevant, or unknown files.\n\n## USAGE\n\n```bash\nhscan -d \u003cPATH\u003e -db \u003cPATH\u003e\n-d string\n      Directory to scan recursively\n-db string\n      Directory containing text files with sha1 to search (1 checksum by line)\n```\n\n## EXAMPLE\n\nYou have the file `dbpath/sha1.txt` :\n\n```\nfed5cdfb1c9b121ea6d042dd54842407df3b4a6b\n64725786589f263f0ecc1da55c2bcac7eb18e681\n12d81f50767d4e09aa7877da077ad9d1b915d75b\n```\n\nSearching for files having those checksums in the directory `test/` :\n\n```bash\nhscan -d test -db dbpath\n\n# result :\nLoading database file \"dbpath/sha1.txt\"... 
3 uniq checksum found in \"46.975µs\"\n\nScanning path \"tmp\"...\n  1964 files - 0 unreadable files - 492 dirs - 0 unreadable dirs - 3 matches\n\nRESULT\n  sha1tmp.txt                              : 3 matches\n  Total                                    : 3 matches\n\nDone in 292.09673ms\n```\n\nMatching files, unknown files, and errors are written in real time into `result.csv` :\n\n```csv\n# sha1,dbfile,filename,error\ndff8a1731f59ccad056b346102d1e1d014b843f3,nsrl_uniq.txt,/home/jeff/tmp/.vscode/settings.json,\n0841f15b7436126cb2877b094d632dbc2707eda0,,/home/jeff/tmp/img_20190502_175115.jpg,\n98fb7452234c1d7666a54a53eb7340e501d8c173,sha1test.txt,/home/jeff/tmp/602352874.jpg,\n,,/home/jeff/tmp/mysqltmp/undo_001,open /home/jeff/tmp/mysqltmp/undo_001: permission denied\n```\n\nA SQLite3 database named `result.db` with the same data as the CSV is created at the end of the process.\n\n## INSTALL\n\nGet the [latest release](https://github.com/Tazeg/hscan/releases) or download and install from source :\n\n```bash\ngit config --global --add url.\"git@github.com:\".insteadOf \"https://github.com/\"\ngo get github.com/Tazeg/hscan\ncd ~/go/src/github.com/Tazeg/hscan\n\n# Linux\nenv GOOS=linux GOARCH=amd64 go build hscan.go\n\n# Windows\nenv GOOS=windows GOARCH=amd64 go build -o hscan.exe hscan.go\n\n# Raspberry Pi\nenv GOARM=7 GOARCH=arm go build hscan.go\n\ngo install\n```\n\n## TEST\n\n```bash\ngo test\n```\n\n## BENCHMARKS\n\nTried on :\n\n- OS : Linux\n- Storage : 128 GB SSD + 2 TB HDD\n- CPU: Intel(R) Xeon(R) CPU E5-1660 v3 @ 3.00GHz\n- Memory: 32 GB\n\nLoading a 1.2 GB NIST/NSRL file containing 29,459,433 unique checksums took 22.14s.\nScanning the 2 TB and 128 GB drives took 1h32m34s. This depends on how much data is stored and how much free space remains on the drives. Further tests will be done shortly.\n\n```bash\n$\u003e hscan -d / -db bases_hash/\nLoading database file \"bases_hash/nsrl_sha1_uniq.txt\"... 
29459433 uniq checksum found in \"22.146464941s\"\n\nScanning path \"/\"...\n  2012574 files - 12091 unreadable files - 274715 dirs - 2510 unreadable dirs - 287870 matches\n\nRESULT\n  nsrl_sha1_uniq.txt                       : 287870 matches\n  Total                                    : 287870 matches\n\nDone in 1h32m34.505006098s\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftazeg%2Fhscan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftazeg%2Fhscan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftazeg%2Fhscan/lists"}