{"id":20238601,"url":"https://github.com/cadmiumcr/language_detector","last_synced_at":"2025-08-24T14:16:58.859Z","repository":{"id":82833199,"uuid":"204344749","full_name":"cadmiumcr/language_detector","owner":"cadmiumcr","description":"Detects the language of a text sample","archived":false,"fork":false,"pushed_at":"2020-05-20T16:56:39.000Z","size":400,"stargazers_count":7,"open_issues_count":4,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-24T17:13:47.785Z","etag":null,"topics":["crystal","crystal-lang","language-detection"],"latest_commit_sha":null,"homepage":"","language":"Crystal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cadmiumcr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-25T20:12:18.000Z","updated_at":"2023-04-29T05:56:43.000Z","dependencies_parsed_at":"2023-03-04T19:45:45.748Z","dependency_job_id":null,"html_url":"https://github.com/cadmiumcr/language_detector","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cadmiumcr%2Flanguage_detector","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cadmiumcr%2Flanguage_detector/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cadmiumcr%2Flanguage_detector/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cadmiumcr%2Flanguage_detector/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/
cadmiumcr","download_url":"https://codeload.github.com/cadmiumcr/language_detector/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248281422,"owners_count":21077423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crystal","crystal-lang","language-detection"],"created_at":"2024-11-14T08:34:54.435Z","updated_at":"2025-04-10T19:35:40.843Z","avatar_url":"https://github.com/cadmiumcr.png","language":"Crystal","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Language Detector\n\n![](https://github.com/cadmiumcr/language_detector/workflows/language_detector/badge.svg)\n\nCrystal port of [franc](https://github.com/wooorm/franc).\n\nIt is not a state-of-the-art language identification algorithm, but it gets 90%+ success on sufficiently long text samples.\n\nIt supports 400+ languages.\n\nIt identifies a text sample by extracting its three-character trigrams and comparing them to the most frequent trigrams extracted from translations of the [UDHR](https://www.un.org/en/universal-declaration-human-rights/) in each available language.\n\nLanguage Detector returns the ISO 639-1 two-letter language code of the most probable guess.\n\n## Installation\n\n1. Add the dependency to your `shard.yml`:\n\n   ```yaml\n   dependencies:\n     cadmium_language_detector:\n       github: cadmiumcr/language_detector\n   ```\n\n2. 
Run `shards install`\n\n## Usage\n\n```crystal\nrequire \"cadmium_language_detector\"\n\ntext = \"Alice was published in 1865, three years after Charles Lutwidge Dodgson and the Reverend Robinson Duckworth rowed in a\nboat, on 4 July 1862 [4] (this popular date of the golden afternoon [5] might be a confusion or even another Alice-tale, for that\nparticular day was cool, cloudy and rainy [6] ), up the Isis with the three young daughters of Henry Liddell (the Vice-Chancellor of Oxford University and Dean of Christ Church): Lorina Charlotte Liddell (aged\n13, born 1849) (Prima in the book's prefatory verse); Alice Pleasance Liddell\n(aged 10, born 1852) (Secunda in the prefatory verse); Edith Mary Liddell\n(aged 8, born 1853) (Tertia in the prefatory verse). [7]\nThe journey began at Folly Bridge near Oxford and ended five miles away in the\nvillage of Godstow. During the trip Charles Dodgson told the girls a story that\nfeatured a bored little girl named Alice who goes looking for an adventure. The\ngirls loved it, and Alice Liddell asked Dodgson to write it down for her. He\nbegan writing the manuscript of the story the next day, although that earliest\nversion no longer exists. The girls and Dodgson took another boat trip a month\nlater when he elaborated the plot to the story of Alice, and in November he\nbegan working on the manuscript in earnest.\"\n\npp LanguageDetector.new.detect(text) # =\u003e \"en\"\n\n```\n\n\n\n## Contributing\n\n1. Fork it (\u003chttps://github.com/cadmiumcr/language_detector/fork\u003e)\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. 
Create a new Pull Request\n\n## Contributors\n\n- [Rémy Marronnier](https://github.com/rmarronnier) - creator and maintainer\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcadmiumcr%2Flanguage_detector","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcadmiumcr%2Flanguage_detector","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcadmiumcr%2Flanguage_detector/lists"}