{"id":16883698,"url":"https://github.com/cyb3rk0tik/pyfranc","last_synced_at":"2025-10-25T01:20:23.071Z","repository":{"id":40545952,"uuid":"418923016","full_name":"cyb3rk0tik/pyfranc","owner":"cyb3rk0tik","description":"Text language detection basic on trigrams.","archived":false,"fork":false,"pushed_at":"2023-10-02T16:59:44.000Z","size":391,"stargazers_count":14,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-28T16:18:13.685Z","etag":null,"topics":["classification","cli","detect","detection","language","language-detection","library","natural-language","python","trigrams"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cyb3rk0tik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-19T12:52:08.000Z","updated_at":"2025-06-17T10:05:43.000Z","dependencies_parsed_at":"2024-10-21T18:34:57.126Z","dependency_job_id":null,"html_url":"https://github.com/cyb3rk0tik/pyfranc","commit_stats":{"total_commits":3,"total_committers":2,"mean_commits":1.5,"dds":"0.33333333333333337","last_synced_commit":"b931af847e4a90e3219c90466c9a949eb2cdf2bc"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cyb3rk0tik/pyfranc","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyb3rk0tik%2Fpyfranc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyb3rk0tik%2Fpyfranc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/cyb3rk0tik%2Fpyfranc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyb3rk0tik%2Fpyfranc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cyb3rk0tik","download_url":"https://codeload.github.com/cyb3rk0tik/pyfranc/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyb3rk0tik%2Fpyfranc/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266507366,"owners_count":23940055,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","cli","detect","detection","language","language-detection","library","natural-language","python","trigrams"],"created_at":"2024-10-13T16:13:56.422Z","updated_at":"2025-10-25T01:20:22.994Z","avatar_url":"https://github.com/cyb3rk0tik.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Pyfranc\nText language detection based on trigrams.\nSupports the [414](https://github.com/wooorm/franc/blob/main/packages/franc-all/readme.md#support) languages of [franc-all](https://github.com/wooorm/franc/tree/main/packages/franc-all).\n\n## Install\n\nThis package is tested on Python 3.8, but should work on any Python 3 release.\n\nInstall with [pip](https://pip.pypa.io/en/stable/installation/):\n\n```sh\npip install 
pyfranc\n```\n\n## Use\n### As a library\n\n```python\nfrom pyfranc import franc\n\nfranc.lang_detect('Alle menslike wesens word vry')[0][0]  # 'afr'\nfranc.lang_detect('এটি একটি ভাষা একক IBM স্ক্রিপ্ট')[0][0]  # 'ben'\nfranc.lang_detect('Alle menneske er fødde til fridom')[0][0]  # 'nno'\nfranc.lang_detect('')[0][0]  # 'und'\n\n# You can change what’s too short (default: 10):\nfranc.lang_detect('the')[0][0]  # 'und'\nfranc.lang_detect('the', minlength=3)[0][0]  # 'sco'\n\n# [0][0] takes the first value (the ISO language code) of the first element in the output list.\n```\n\n#### `whitelist`\n\n```python\nfranc.lang_detect('Considerando ser essencial que os direitos humanos', whitelist=['por', 'spa'])\n# [['por', 1], ['spa', 0.6034146900423971]]\n```\n\n#### `blacklist`\n\n```python\nfranc.lang_detect('Considerando ser essencial que os direitos humanos', blacklist=['src', 'glg'])\n# [['por', 1],\n#  ['ina', 0.6211756617394293],\n#  ['spa', 0.6034146900423971],\n#  ['ast', 0.5628509224246592],\n#  ['oci', 0.5583820327718574],\n#  ... 
317 more items]\n```\n\n### As a CLI\n\n```\nusage: pyfranc_cli [-h] [-v] [-s STRING] [-t TOP] [-m MINLENGTH] [-w [WHITELIST [WHITELIST ...]]]\n                   [-b [BLACKLIST [BLACKLIST ...]]] [-a] [-f] [-p]\n\nCLI to detect the language of text.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -v, --version         Print version number.\n  -s STRING, --string STRING\n                        Input string.\n  -t TOP, --top TOP     Print top results.\n  -m MINLENGTH, --minlength MINLENGTH\n                        Minimum string length to accept.\n  -w [WHITELIST [WHITELIST ...]], --whitelist [WHITELIST [WHITELIST ...]], \n  -o [WHITELIST [WHITELIST ...]], --only [WHITELIST [WHITELIST ...]]\n                        Allow languages.\n  -b [BLACKLIST [BLACKLIST ...]], --blacklist [BLACKLIST [BLACKLIST ...]], \n  -i [BLACKLIST [BLACKLIST ...]], --ignore [BLACKLIST [BLACKLIST ...]]\n                        Disallow languages.\n  -a, --all             Output all raw results.\n  -f, --full            Print full name of language (with lang code).\n  -p, --percentage      Print relative match value (in percent).\n```\n\nExamples:\n\n```\n# output language\n$ pyfranc_cli -t 1 -s \"Alle menslike wesens word vry\"\n# 'afr' : 1.0\n\n# output language from stdin (expects utf8); \"$(cat)\" passes the piped text to -s\n$ echo \"এটি একটি ভাষা একক IBM স্ক্রিপ্ট\" | pyfranc_cli -t 1 -s \"$(cat)\"\n# 'ben' : 1.0\n\n# ignore certain languages\n$ pyfranc_cli --blacklist por glg -s \"O Brasil caiu 26 posições\"\n# 'vec' : 1.0\n\n# output language from stdin, restricted to the given languages\n$ echo \"Alle mennesker er født frie og\" | pyfranc_cli -t 1 --whitelist nob dan -s \"$(cat)\"\n# 'nob' : 1.0\n\n# output all results in raw-list format\n$ pyfranc_cli --all -s \"Considerando ser essencial que os direitos humanos\"\n# [['por', 1.0], ['glg', 0.771284519307895], ... 
320 more items]\n\n# display the full language name\n$ pyfranc_cli --full -t 1 -s \"Alle menslike wesens word vry\"\n# Afrikaans (afr) : 1.0\n\n# output results as relative percentages\n$ pyfranc_cli -t 5 --percentage -s \"Considerando ser essencial que os direitos humanos\"\n# por : 28%\n# glg : 22%\n# ina : 17%\n# spa : 17%\n# ast : 16%\n```\n\n## Derivation\n\nPyfranc is an outright port of [Franc](https://github.com/wooorm/franc) (JavaScript, MIT),\n[trigram-utils](https://github.com/wooorm/trigram-utils) (JavaScript, MIT), [collapse-white-space](https://github.com/wooorm/collapse-white-space)\n(JavaScript, MIT), and [n-gram](https://github.com/words/n-gram) (JavaScript, MIT),\nall by [Titus Wormer](https://github.com/wooorm).\n\n## License\n\n[MIT](https://github.com/cyb3rk0tik/pyfranc/blob/master/LICENSE) © [cyb3rk0tik](https://github.com/cyb3rk0tik)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyb3rk0tik%2Fpyfranc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyb3rk0tik%2Fpyfranc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyb3rk0tik%2Fpyfranc/lists"}