{"id":27009873,"url":"https://github.com/streanger/duplicate","last_synced_at":"2025-04-04T10:52:57.788Z","repository":{"id":132640350,"uuid":"560601996","full_name":"streanger/duplicate","owner":"streanger","description":"files duplicate viewer","archived":false,"fork":false,"pushed_at":"2024-03-09T13:52:47.000Z","size":324,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-03-09T14:55:45.705Z","etag":null,"topics":["duplicate-detection","duplicates","gui","python","tkinter-python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/streanger.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-11-01T21:18:06.000Z","updated_at":"2024-03-09T14:55:45.705Z","dependencies_parsed_at":"2023-07-07T11:01:04.414Z","dependency_job_id":null,"html_url":"https://github.com/streanger/duplicate","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streanger%2Fduplicate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streanger%2Fduplicate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streanger%2Fduplicate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streanger%2Fduplicate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/streanger","download_url":"https://codeload.github.com/streanger/duplicate/tar.gz/refs/heads/main","hos
t":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247166137,"owners_count":20894652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["duplicate-detection","duplicates","gui","python","tkinter-python"],"created_at":"2025-04-04T10:52:57.098Z","updated_at":"2025-04-04T10:52:57.776Z","avatar_url":"https://github.com/streanger.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# duplicate\n\nfiles duplicate viewer\n\n## install\n\n```\npip install git+https://github.com/streanger/duplicate.git\n```\n\n## usage\n\n```bash\n# from cli\nduplicate  # with no args will run gui\nduplicate \u003cdirectory\u003e  # will run cli search\nduplicate . \".jpg, .png\"  # search in current directory, filter by .jpg, .png\n\n# as module\npython -m duplicate\n```\n\n```python\n# gui\nimport duplicate\nduplicate.gui()\n\n# console search\nfrom rich import print\nfrom duplicate import search\ndirectory = 'path/to/directory/with/duplicates'\nresults = search(directory)\nprint(results)\n```\n\n## screenshots\n\n![image](screenshots/duplicate01.png)\n\n## changelog\n\n- v. 0.1.5\n    - select file instead of opening directory\n    - reload extensions before search\n\n- v. 0.1.4\n    - default block size equals system page size (usually 4kB)\n\n- v. 0.1.3\n    - reading chunks of data using a \"with\" statement. Only one file handle is open at a time\n    - `keep searching...` while search lasts\n\n- v. 
0.1.2\n    - fix for `OSError` using `gc.collect()` due to locked file handles after many searches\n    - `__slots__` in `FileHash` class\n    - version info in gui and cli\n\n- v. 0.1.1\n    - initial filter and matching by file size\n    - file handles, which allow reading files in chunks and improve speed a lot\n    - entrypoint cli and gui support\n\n- v. 0.1.0\n    - gui\n    - matching hashes of full file content\n\n## develop \u0026 debug\n\n```bash\n# general setup \u0026 tests\npython -m venv venv\nvenv/Scripts/Activate.ps1\npython -m duplicate\npython -m duplicate . \".py\"\npython -m duplicate --help\n\n# max recursion depth half-auto tests\npython .\\scripts\\create_files.py\npython -m duplicate scripts\n\n# duplicate.main\npython -m duplicate.main\n```\n\n## ideas\n\n- multithreaded file loading, to increase speed\n\n- different hashing algorithm\n\n- feature of breaking threads using the `clear` button\n\n- dynamically pack labels - show only visible ones\n\n- faster search (+)\n\n- faster window moving while many rows exist\n\n- green progressbar for search\n\n- sync between search and gui\n\n- resizable filedialog (if possible)\n\n- after method in tkinter if needed\n\n- reset scrollbar for maximized windows should be fixed\n\n- more information on the bottom info label\n\n- better threads handling; maybe use of queue\n\n- utils module for class staticmethods\n\n- pylint \u0026 black todo\n\n- screenshot(s) to upload (+)\n\n- tests for maximum recursion depth\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreanger%2Fduplicate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstreanger%2Fduplicate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreanger%2Fduplicate/lists"}