{"id":13366839,"url":"https://github.com/abadojack/whatlangGo","last_synced_at":"2025-03-12T18:31:33.644Z","repository":{"id":48412423,"uuid":"82584355","full_name":"abadojack/whatlanggo","owner":"abadojack","description":"Natural language detection library for Go","archived":false,"fork":false,"pushed_at":"2023-03-28T08:08:05.000Z","size":246,"stargazers_count":640,"open_issues_count":13,"forks_count":66,"subscribers_count":15,"default_branch":"master","last_synced_at":"2024-10-25T05:24:22.061Z","etag":null,"topics":["go","language","nlp","text-processing"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/abadojack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"SUPPORTED_LANGUAGES.md","governance":null,"roadmap":null,"authors":null}},"created_at":"2017-02-20T17:32:01.000Z","updated_at":"2024-10-21T18:34:02.000Z","dependencies_parsed_at":"2024-01-08T15:34:45.086Z","dependency_job_id":null,"html_url":"https://github.com/abadojack/whatlanggo","commit_stats":{"total_commits":29,"total_committers":7,"mean_commits":4.142857142857143,"dds":0.3448275862068966,"last_synced_commit":"9a096a12270b527608792719d6e75e68a8bbfb03"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abadojack%2Fwhatlanggo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abadojack%2Fwhatlanggo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abadojack%2Fwhatlanggo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/abadojack%2Fwhat
langgo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/abadojack","download_url":"https://codeload.github.com/abadojack/whatlanggo/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243271325,"owners_count":20264437,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","language","nlp","text-processing"],"created_at":"2024-07-30T00:01:32.824Z","updated_at":"2025-03-12T18:31:33.348Z","avatar_url":"https://github.com/abadojack.png","language":"Go","funding_links":[],"categories":["自然语言处理","自然語言處理"],"sub_categories":["高级控制台界面","高級控制台界面"],"readme":"# Whatlanggo\n\n[![Build Status](https://travis-ci.org/abadojack/whatlanggo.svg?branch=master)](https://travis-ci.org/abadojack/whatlanggo)  [![Go Report Card](https://goreportcard.com/badge/github.com/abadojack/whatlanggo)](https://goreportcard.com/report/github.com/abadojack/whatlanggo)  [![GoDoc](https://godoc.org/github.com/abadojack/whatlanggo?status.png)](https://godoc.org/github.com/abadojack/whatlanggo) [![Coverage Status](https://coveralls.io/repos/github/abadojack/whatlanggo/badge.svg)](https://coveralls.io/github/abadojack/whatlanggo)\n\nNatural language detection for Go.\n## Features\n* Supports [84 languages](https://github.com/abadojack/whatlanggo/blob/master/SUPPORTED_LANGUAGES.md)\n* 100% written in Go\n* No external dependencies\n* Fast\n* Recognizes not only a language, but also a script (Latin, Cyrillic, etc)\n\n## Getting started\nInstallation:\n```sh\n    go get -u github.com/abadojack/whatlanggo\n```\n\nSimple usage 
example:\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\n\t\"github.com/abadojack/whatlanggo\"\n)\n\nfunc main() {\n\tinfo := whatlanggo.Detect(\"Foje funkcias kaj foje ne funkcias\")\n\tfmt.Println(\"Language:\", info.Lang.String(), \"Script:\", whatlanggo.Scripts[info.Script], \"Confidence:\", info.Confidence)\n}\n```\n\n## Blacklisting and whitelisting\n```go\npackage main\n\nimport (\n\t\"fmt\"\n\n\t\"github.com/abadojack/whatlanggo\"\n)\n\nfunc main() {\n\t// Blacklist: exclude Yiddish from the candidate languages.\n\toptions := whatlanggo.Options{\n\t\tBlacklist: map[whatlanggo.Lang]bool{\n\t\t\twhatlanggo.Ydd: true,\n\t\t},\n\t}\n\n\tinfo := whatlanggo.DetectWithOptions(\"האקדמיה ללשון העברית\", options)\n\n\tfmt.Println(\"Language:\", info.Lang.String(), \"Script:\", whatlanggo.Scripts[info.Script])\n\n\t// Whitelist: consider only Esperanto and Ukrainian.\n\toptions1 := whatlanggo.Options{\n\t\tWhitelist: map[whatlanggo.Lang]bool{\n\t\t\twhatlanggo.Epo: true,\n\t\t\twhatlanggo.Ukr: true,\n\t\t},\n\t}\n\n\tinfo = whatlanggo.DetectWithOptions(\"Mi ne scias\", options1)\n\tfmt.Println(\"Language:\", info.Lang.String(), \"Script:\", whatlanggo.Scripts[info.Script])\n}\n```\nFor more details, please check the [documentation](https://godoc.org/github.com/abadojack/whatlanggo).\n\n## Requirements\nGo 1.8 or higher\n\n## How does it work?\n\n### How does the language recognition work?\n\nThe algorithm is based on trigram language models, a particular case of n-grams.\nTo understand the idea, please check the original whitepaper [Cavnar and Trenkle '94: N-Gram-Based Text Categorization](https://www.researchgate.net/publication/2375544_N-Gram-Based_Text_Categorization).\n\n### How is _IsReliable_ calculated?\n\nIt is based on the following factors:\n* How many unique trigrams are in the given text\n* How big is the difference between the first and the second (not returned) detected languages? 
This metric is called `rate` in the code base.\n\nThe result can therefore be represented as a 2D space with a threshold function that splits it into \"Reliable\" and \"Not reliable\" areas.\nThis function is a hyperbola and looks like the following:\n\n\u003cimg alt=\"Language recognition whatlang rust\" src=\"https://raw.githubusercontent.com/abadojack/whatlanggo/master/images/whatlang_is_reliable.png\" width=\"450\" height=\"300\" /\u003e\n\nFor more details, please see the blog article [Introduction to Rust Whatlang Library and Natural Language Identification Algorithms](https://www.greyblake.com/blog/2017-07-30-introduction-to-rust-whatlang-library-and-natural-language-identification-algorithms/).\n\n## License\n[MIT](https://github.com/abadojack/whatlanggo/blob/master/LICENSE)\n\n## Derivation\nwhatlanggo is a derivative of [Franc](https://github.com/wooorm/franc) (JavaScript, MIT) by [Titus Wormer](https://github.com/wooorm).\n\n## Acknowledgements\nThanks to [greyblake](https://github.com/greyblake) (Potapov Sergey) for creating [whatlang-rs](https://github.com/greyblake/whatlang-rs), from which I got the idea and algorithms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabadojack%2FwhatlangGo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fabadojack%2FwhatlangGo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fabadojack%2FwhatlangGo/lists"}