{"id":15029760,"url":"https://github.com/brikerman/kashgari","last_synced_at":"2025-04-09T00:30:02.609Z","repository":{"id":37664076,"uuid":"166495086","full_name":"BrikerMan/Kashgari","owner":"BrikerMan","description":"Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.","archived":false,"fork":false,"pushed_at":"2024-09-03T21:05:29.000Z","size":15036,"stargazers_count":2391,"open_issues_count":28,"forks_count":441,"subscribers_count":64,"default_branch":"v2-main","last_synced_at":"2024-10-29T15:34:00.532Z","etag":null,"topics":["bert","bert-model","gpt-2","machine-learning","named-entity-recognition","ner","nlp","nlp-framework","seq2seq","sequence-labeling","text-classification","text-labeling","transfer-learning"],"latest_commit_sha":null,"homepage":"http://kashgari.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BrikerMan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"patreon":"brikerman"}},"created_at":"2019-01-19T01:53:28.000Z","updated_at":"2024-10-14T03:20:36.000Z","dependencies_parsed_at":"2024-11-09T04:22:41.462Z","dependency_job_id":"e89c20ef-33d8-4f6f-a3b7-4912521124ea","html_url":"https://github.com/BrikerMan/Kashgari","commit_stats":{"total_commits":848,"total_committers":23,"mean_commits":"36.869565217391305","dds":"0.12735849056603776","last_synced_commit":"ffe730d33f894e99a6fd7aa17ca67d161bf70359"},"previous_names":[],"tags_co
unt":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrikerMan%2FKashgari","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrikerMan%2FKashgari/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrikerMan%2FKashgari/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BrikerMan%2FKashgari/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BrikerMan","download_url":"https://codeload.github.com/BrikerMan/Kashgari/tar.gz/refs/heads/v2-main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247949565,"owners_count":21023345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","bert-model","gpt-2","machine-learning","named-entity-recognition","ner","nlp","nlp-framework","seq2seq","sequence-labeling","text-classification","text-labeling","transfer-learning"],"created_at":"2024-09-24T20:11:34.474Z","updated_at":"2025-04-09T00:30:02.589Z","avatar_url":"https://github.com/BrikerMan.png","language":"Python","funding_links":["https://patreon.com/brikerman"],"categories":[],"sub_categories":[],"readme":"\u003c!-- prettier-ignore-start --\u003e\n\u003c!-- markdownlint-disable --\u003e\n\u003ch1 align=\"center\"\u003e\n    \u003ca href='https://en.wikipedia.org/wiki/Mahmud_al-Kashgari'\u003eKashgari\u003c/a\u003e\n\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/BrikerMan/kashgari/blob/master/LICENSE\"\u003e\n        
\u003cimg alt=\"GitHub\" src=\"https://img.shields.io/github/license/BrikerMan/kashgari.svg?color=blue\u0026style=popout\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://join.slack.com/t/kashgari/shared_invite/enQtODU4OTEzNDExNjUyLTY0MzI4MGFkZmRkY2VmMzdmZjRkZTYxMmMwNjMyOTI1NGE5YzQ2OTZkYzA1YWY0NTkyMDdlZGY5MGI5N2U4YzM\"\u003e\n        \u003cimg alt=\"Slack\" src=\"https://img.shields.io/badge/chat-Slack-blueviolet?logo=Slack\u0026style=popout\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://travis-ci.com/BrikerMan/Kashgari\"\u003e\n        \u003cimg src=\"https://travis-ci.com/BrikerMan/Kashgari.svg?branch=master\"/\u003e\n    \u003c/a\u003e\n    \u003ca href='https://coveralls.io/github/BrikerMan/Kashgari?branch=master'\u003e\n        \u003cimg src='https://coveralls.io/repos/github/BrikerMan/Kashgari/badge.svg?branch=master' alt='Coverage Status'/\u003e\n    \u003c/a\u003e\n     \u003ca href=\"https://pepy.tech/project/kashgari\"\u003e\n        \u003cimg src=\"https://pepy.tech/badge/kashgari\"/\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/kashgari/\"\u003e\n        \u003cimg alt=\"PyPI\" src=\"https://img.shields.io/pypi/v/kashgari.svg\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch4 align=\"center\"\u003e\n    \u003ca href=\"#overview\"\u003eOverview\u003c/a\u003e |\n    \u003ca href=\"#performance\"\u003ePerformance\u003c/a\u003e |\n    \u003ca href=\"#installation\"\u003eInstallation\u003c/a\u003e |\n    \u003ca href=\"https://kashgari.readthedocs.io/\"\u003eDocumentation\u003c/a\u003e |\n    \u003ca href=\"https://kashgari.readthedocs.io/about/contributing/\"\u003eContributing\u003c/a\u003e\n\u003c/h4\u003e\n\n\u003c!-- markdownlint-enable --\u003e\n\u003c!-- prettier-ignore-end --\u003e\n\n🎉🎉🎉 We released the 2.0.0 version with TF2 Support. 
🎉🎉🎉\n\nIf you use this project for your research, please cite:\n\n```\n@misc{Kashgari,\n  author = {Eliyar Eziz},\n  title = {Kashgari},\n  year = {2019},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https://github.com/BrikerMan/Kashgari}}\n}\n```\n\n## Overview\n\nKashgari is a simple and powerful NLP transfer learning framework that lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.\n\n- **Human-friendly**. Kashgari's code is straightforward, well documented and tested, which makes it very easy to understand and modify.\n- **Powerful and simple**. Kashgari allows you to apply state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS) and classification.\n- **Built-in transfer learning**. Kashgari has built-in pre-trained BERT and Word2vec embedding models, which makes it very simple to use transfer learning when training your model.\n- **Fully scalable**. Kashgari provides a simple, fast, and scalable environment for experimentation: train your models and try new approaches using different embeddings and model structures.\n- **Production Ready**. 
Kashgari can export models in the `SavedModel` format for TensorFlow Serving, so you can deploy them directly on the cloud.\n\n## Our Goal\n\n- **Academic users** Easier experimentation to prove their hypotheses without coding from scratch.\n- **NLP beginners** Learn how to build an NLP project with production-level code quality.\n- **NLP developers** Build a production-level classification/labeling model within minutes.\n\n## Performance\n\nContributions of additional performance reports are welcome.\n\n| Task                       | Language | Dataset                     | Score |\n| -------------------------- | -------- | --------------------------- | ----- |\n| [Named Entity Recognition] | Chinese  | [People's Daily Ner Corpus] | 95.57 |\n| [Text Classification]      | Chinese  | [SMP2018ECDTCorpus]         | 94.57 |\n\n## Installation\n\nThe project is based on Python 3.6+, because it is 2019 and type hinting is cool.\n\n| Backend          | kashgari version                       | Description           |\n| ---------------- | -------------------------------------- | --------------------- |\n| TensorFlow 2.2+  | `pip install 'kashgari\u003e=2.0.2'`        | TF2.10+ with tf.keras |\n| TensorFlow 1.14+ | `pip install 'kashgari\u003e=1.0.0,\u003c2.0.0'` | TF1.14+ with tf.keras |\n| Keras            | `pip install 'kashgari\u003c1.0.0'`         | keras version         |\n\nYou also need to install the `tensorflow_addons` version that matches your TensorFlow version.\n\n| TensorFlow Version       | tensorflow_addons version               |\n| ------------------------ | --------------------------------------- |\n| TensorFlow 2.1           | `pip install tensorflow_addons==0.9.1`  |\n| TensorFlow 2.2           | `pip install tensorflow_addons==0.11.2` |\n| TensorFlow 2.3, 2.4, 2.5 | `pip install tensorflow_addons==0.13.0` |\n\n## Tutorials\n\nHere is a set of quick tutorials to get you started with the library:\n\n- [Tutorial 1: Text Classification](./docs/tutorial/text-classification.md)\n- [Tutorial 2: Text 
Labeling](./docs/tutorial/text-labeling.md)\n- [Tutorial 3: Seq2Seq](./docs/tutorial/seq2seq.md)\n- [Tutorial 4: Language Embedding](./docs/embeddings/index.md)\n\nThere are also articles and posts that illustrate how to use Kashgari:\n\n- [Short Text Classification with Kashgari 2: Data Analysis and Preprocessing](https://eliyar.biz/short_text_classificaion_with_kashgari_v2_part_1/index.html)\n- [Short Text Classification with Kashgari 2: Model Training and Tuning](https://eliyar.biz/nlp/short_text_classificaion_with_kashgari_v2_part_2/index.html)\n- [Short Text Classification with Kashgari 2: Model Deployment](https://eliyar.biz/nlp/short_text_classificaion_with_kashgari_v2_part_3/index.html)\n- [Build a Chinese Text Classification Model in 15 Minutes](https://eliyar.biz/nlp_chinese_text_classification_in_15mins/)\n- [BERT-Based Chinese Named Entity Recognition (NER)](https://eliyar.biz/nlp_chinese_bert_ner/)\n- [BERT/ERNIE Text Classification and Deployment](https://eliyar.biz/nlp_train_and_deploy_bert_text_classification/)\n- [Build a BERT-Based NER Model in Five Minutes](https://www.jianshu.com/p/1d6689851622)\n- [Multi-Class Text Classification with Kashgari in 15 minutes](https://medium.com/@BrikerMan/multi-class-text-classification-with-kashgari-in-15mins-c3e744ce971d)\n\nExamples:\n\n- [Neural machine translation with Seq2Seq](./examples/translate_with_seq2seq.ipynb)\n\n## Contributors ✨\n\nThanks go to these wonderful people. 
And there are many ways to get involved.\nStart with the [contributor guidelines](./docs/about/contributing.md) and then check these open issues for specific tasks.\n\n[Named Entity Recognition]: /tutorial/text-labeling/#chinese-ner-performance\n[People's Daily Ner Corpus]: /apis/corpus/#kashgari.corpus.ChineseDailyNerCorpus\n[Text Classification]: /tutorial/text-classification/#short-sentence-classification-performance\n[SMP2018ECDTCorpus]: /apis/corpus/#kashgari.corpus.SMP2018ECDTCorpus\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrikerman%2Fkashgari","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrikerman%2Fkashgari","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrikerman%2Fkashgari/lists"}