Projects in Awesome Lists tagged with chinese-word-segmentation
A curated list of projects in awesome lists tagged with chinese-word-segmentation.
https://github.com/embedding/chinese-word-vectors
100+ pre-trained Chinese word vectors
chinese chinese-word-segmentation embedding embeddings vectors-trained word-embeddings
Last synced: 10 Apr 2025
https://github.com/lancopku/pkuseg-python
The pkuseg toolkit for multi-domain Chinese word segmentation
Last synced: 09 Apr 2025
https://github.com/baidu/lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
chinese-nlp chinese-word-segmentation java lexical-analysis named-entity-recognition part-of-speech-tagger python word-segmentation
Last synced: 11 Apr 2025
https://github.com/ownthink/Jiagu
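Jiagu deep learning NLP toolkit: knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering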
chinese-word-segmentation cws ner nlp pos
Last synced: 09 Apr 2025
https://github.com/wolfgarbe/SymSpell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
approximate-string-matching chinese-text-segmentation chinese-word-segmentation damerau-levenshtein edit-distance fuzzy-matching fuzzy-search levenshtein levenshtein-distance spell-check spellcheck spelling spelling-correction symspell text-segmentation word-segmentation
Last synced: 13 Mar 2025
https://github.com/didi/ChineseNLP
Datasets and SOTA results for every field of Chinese NLP
chinese-nlp chinese-word-segmentation entity-linking machine-translation nlp nlp-tasks question-answering
Last synced: 09 Apr 2025
https://github.com/lionsoul2014/jcseg
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
chinese-nlp chinese-text-segmentation chinese-word-segmentation elasticsearch-analyzer elasticsearch-tokenizer java jcseg jcseg-analyzer keywords-extraction lucene-analyzer lucene-tokenizer mmseg natural-language-processing nlp nlp-keywords-extraction opensearch-analyzer opensearch-tokenizer pos-tagging solr-plugin
Last synced: 07 Apr 2025
https://github.com/mammothb/symspellpy
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
approximate-string-matching chinese-text-segmentation chinese-word-segmentation damerau-levenshtein edit-distance fuzzy-matching fuzzy-search levenshtein levenshtein-distance python spell-check spellcheck spelling spelling-correction symspell text-segmentation word-segmentation
Last synced: 27 Nov 2024
https://github.com/messense/jieba-rs
The Jieba Chinese Word Segmentation Implemented in Rust
chinese-word-segmentation jieba jieba-chinese nlp wasm
Last synced: 10 Apr 2025
https://github.com/lionsoul2014/friso
High-performance Chinese tokenizer with both GBK and UTF-8 charset support, based on the MMSEG algorithm and written in ANSI C. Fully modular in design, it can be easily embedded in other programs such as MySQL, PostgreSQL, and PHP.
c chinese-tokenizer chinese-word-segmentation cjk-tokenizer full-text-search japanese-tokenizer korean-tokenizer php-tokenizer tokenizer
Last synced: 05 Apr 2025
https://github.com/kyubyong/g2pc
g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese
chinese-nlp chinese-word-segmentation crf crfsuite g2p pinyin
Last synced: 07 Apr 2025
https://github.com/supercoderhawk/deeplearning_nlp
A natural language processing library based on deep learning
chinese-tokenizer chinese-word-segmentation deep-learning named-entity-recognition natural-language-processing relation-extraction tensorflow
Last synced: 15 Apr 2025
https://github.com/howl-anderson/microtokenizer
MicroTokenizer: A lightweight, full-featured Chinese tokenizer designed for educational and research purposes, helping students understand how tokenizers work. Provides a practical, hands-on approach to understanding NLP concepts, featuring multiple tokenization algorithms and customizable models. Ideal for students, researchers, and NLP enthusiasts.
chinese-nlp chinese-tokenizer chinese-word-segmentation dag-network educational-project nlp-machine-learning tokenizer
Last synced: 12 Apr 2025
https://github.com/nlpir-team/nlpir-analysis-cn-ictclas
Lucene/Solr analyzer plugin. Supports macOS, Linux x86/64, and Windows x86/64. It is a Maven project, so you can change the Lucene/Solr version to match the release you need.
chinese-word-segmentation ictclas lucene lucene-analyzer nlpir solr
Last synced: 11 Apr 2025
https://github.com/supercoderhawk/dnn_cws
Chinese word segmentation implemented with deep learning
chinese-text-segmentation chinese-word-segmentation deep-learning tensorflow
Last synced: 15 Apr 2025
https://github.com/supercoderhawk/deepnlp
A natural language processing library based on deep learning
chinese-word-segmentation deep-learning named-entity-recognition natural-language-processing tensorflow
Last synced: 15 Apr 2025
https://github.com/hankcs/sub-character-cws
Sub-Character Representation Learning
chinese-word-segmentation cws natural-language-processing nlp representation-learning simplified-chinese traditional-chinese
Last synced: 11 Apr 2025
https://github.com/nlpir-team/nlpir-ictclas
The Java Package of NLPIR-ICTCLAS.
chinese-word-segmentation ictclas nlpir
Last synced: 11 Apr 2025
https://github.com/oscarsun72/textforctext
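Input tool for the Chinese Text Project: faster typing and typesetting for a better input experience, plus Word VBA tools for literary and historical studies, programmed by a PhD in Chinese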
characters chinese chinese-characters chinese-language chinese-text-segmentation chinese-traditional chinese-word-segmentation chrome chromedriver ctext ocr selenium selenium-webdriver sinology text text-content text-editor vba vba-macros vba-word
Last synced: 22 Nov 2024
https://github.com/bububa/jiagu
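Jiagu deep learning NLP toolkit: knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering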
chinese-nlp chinese-word-segmentation classification clustering cws ner nlp pos segmentation
Last synced: 15 Apr 2025
https://github.com/Hoiy/berserker
Berserker - BERt chineSE woRd toKenizER
bert bert-chinese chinese-nlp chinese-word-segmentation nlp sequence-to-sequence state-of-the-art tensorflow tokenizer tpu
Last synced: 02 Apr 2025
https://github.com/messense/cjieba-py
Python cffi binding to CppJieba
cffi chinese-word-segmentation jieba jieba-chinese python-bindings word-segmentation
Last synced: 15 Apr 2025
https://github.com/fumiama/jieba
A performance-optimized version of Jiebago that supports loading dictionaries from an io.Reader
chinese chinese-characters chinese-language chinese-text-segmentation chinese-word-segmentation golang golang-library golang-package jieba jieba-analysis jieba-chinese
Last synced: 09 Apr 2025
https://github.com/ganjinzero/gts
Code for Unsupervised multi-granular Chinese word segmentation and term discovery via graph partition [JBI]
chinese-word-segmentation graph-cut unsupervised
Last synced: 22 Nov 2024
https://github.com/riccorl/chinese-word-segmentation-pytorch
Chinese word segmentation task based on BERT and implemented in PyTorch
bert bert-embeddings chinese chinese-word-segmentation classification cws deep-learning embeddings neural-network nlp pytorch segmentation token transformer
Last synced: 14 Apr 2025
https://github.com/dalinvip/pytorch_chinese_word_segmentation
Chinese word segmentation with a neural seq2seq model, implemented in PyTorch
chinese-word-segmentation pytorch seq2seq
Last synced: 22 Apr 2025
https://github.com/windomz/gcws
gcws is CWS (Chinese Word Segmentation) for Golang: an open-source collection of Chinese word segmenters
Last synced: 09 Apr 2025
https://github.com/ailln/simple-jieba
✂️ A simple version of jieba word segmentation implemented in 100 lines
chinese-word-segmentation jieba jieba-chinese word-segmentation
Last synced: 18 Nov 2024
https://github.com/nlpir-team/elasticsearch-analysis-ictclas
Elasticsearch analysis plugin of ICTCLAS
chinese-word-segmentation elasticsearch-analysis elasticsearch-plugin ictclas
Last synced: 11 Apr 2025
https://github.com/dalinvip/pytorch_joint-word-segmentation-and-pos-tagging-old
pytorch_seq2seq_wordseg_and_postag
chinese-word-segmentation pos-tagging pytorch seq2seq seq2seq-batch
Last synced: 22 Mar 2025
https://github.com/dalinvip/pytorch_seq2seq_wordseg_and_postag_version2
pytorch_seq2seq_wordseg_and_postag_version2
chinese-word-segmentation pos-tagging pytorch seq2seq seq2seq-batch
Last synced: 22 Mar 2025
https://github.com/kemingy/handict
chinese-word-segmentation mmseg tokenization tokenizer
Last synced: 05 Apr 2025
https://github.com/messense/cppjieba-cabi
Idiomatic C ABI for CppJieba
chinese-nlp chinese-segmenter chinese-word-segmentation
Last synced: 23 Apr 2025
https://github.com/usaoc/chissor
GUI application for Chinese word segmentation
chinese-word-segmentation egui
Last synced: 12 Apr 2025
https://github.com/guopeiming/nnsegmentor
Undergraduate graduation project: Chinese word segmentation for Weibo text
chinese-word-segmentation deep-learning nlp
Last synced: 01 Mar 2025
https://github.com/dhchenx/ner-kit
A toolkit for simple NLP APIs based on Stanza
chinese-word-segmentation language-detection named-entity-recognition natural-language-processing ner-kit pos-tagging sentiment-analysis text-analysis
Last synced: 23 Mar 2025