{"id":24084110,"url":"https://github.com/tos-kamiya/dendro_text","last_synced_at":"2026-02-23T01:03:51.949Z","repository":{"id":58868258,"uuid":"192153550","full_name":"tos-kamiya/dendro_text","owner":"tos-kamiya","description":"Draw dendrogram of similarity between text files.","archived":false,"fork":false,"pushed_at":"2024-10-17T16:42:59.000Z","size":219,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-06T04:40:12.308Z","etag":null,"topics":["cli-tool","dendrogram","nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tos-kamiya.png","metadata":{"files":{"readme":"README-pypi.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["tos-kamiya"],"custom":["https://www.amazon.jp/hz/wishlist/ls/3EMPATGCODYA3?ref_=wl_share"]}},"created_at":"2019-06-16T05:24:42.000Z","updated_at":"2024-10-17T16:38:25.000Z","dependencies_parsed_at":"2025-04-30T19:43:56.442Z","dependency_job_id":"ec054b3e-1895-45ee-942f-5a20f210e833","html_url":"https://github.com/tos-kamiya/dendro_text","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/tos-kamiya/dendro_text","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tos-kamiya%2Fdendro_text","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tos-kamiya%2Fdendro_text/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tos-kamiya%2Fdendro_text/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tos-kamiya%2Fdendro_text/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tos-kamiya","download_url":"https://codeload.github.com/tos-kamiya/dendro_text/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tos-kamiya%2Fdendro_text/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274727685,"owners_count":25338400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-11T02:00:13.660Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli-tool","dendrogram","nlp"],"created_at":"2025-01-10T00:19:25.732Z","updated_at":"2025-10-23T21:39:03.957Z","avatar_url":"https://github.com/tos-kamiya.png","language":"Python","funding_links":["https://github.com/sponsors/tos-kamiya","https://www.amazon.jp/hz/wishlist/ls/3EMPATGCODYA3?ref_=wl_share"],"categories":[],"sub_categories":[],"readme":"[![Tests](https://github.com/tos-kamiya/dendro_text/actions/workflows/tests.yaml/badge.svg)](https://github.com/tos-kamiya/dendro_text/actions/workflows/tests.yaml)\n\ndendro_text\n===========\n\nDraw a dendrogram of similarity between text files.\n\nThe similarity is measured in terms of **Damerau-Levenshtein edit distance**.\nThe distance between given two texts is a count of inserted, deleted, and 
substituted characters required to transform one text into the other.\nA smaller value means that the two texts are more similar.\n\nFeatures:\n\n* **Parallel execution**: Supports execution on multiple CPU cores. JIT compilation with Numba is also supported (v1.6+).\n\n* **Options in tokenization**: By default, texts are compared as sequences of words obtained by splitting the input text by character type. Optionally, you can compare texts line by line, character by character, or token by token using lexical analyzers for programming languages.\n\n* **File-centric search**: Lists files in order of similarity to a given file.\n\n* **Diff (Experimental)**: Shows textual differences between files. (This function is provided to check how similarity calculations differ depending on tokenization.)\n\n## Installation\n\n```sh\npipx install dendro-text\n```\n\nTo uninstall:\n\n```sh\npipx uninstall dendro-text\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftos-kamiya%2Fdendro_text","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftos-kamiya%2Fdendro_text","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftos-kamiya%2Fdendro_text/lists"}