https://github.com/howl-anderson/corpus_dataset_for_chinese_nlp
Corpus datasets for Chinese NLP
- Host: GitHub
- URL: https://github.com/howl-anderson/corpus_dataset_for_chinese_nlp
- Owner: howl-anderson
- License: MIT
- Created: 2018-04-25T15:59:22.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-12-14T16:44:54.000Z (almost 7 years ago)
- Last Synced: 2025-01-21T21:47:08.260Z (9 months ago)
- Size: 3.91 KB
- Stars: 20
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# corpus_dataset_for_Chinese_NLP
## Provided by academic institutions
### Fudan Natural Language Processing Group
URL: http://nlp.fudan.edu.cn/
* [Chinese Word Segmentation and POS Tagging for Micro-Blog Texts](http://nlp.fudan.edu.cn/data/)
* [Multi-task Learning for Text Classification](http://nlp.fudan.edu.cn/data/)
* [Neural Sentence Ordering](http://nlp.fudan.edu.cn/data/)
### NLP and Big Data Research Group in the ISTD pillar at the Singapore University of Technology and Design
URL: http://www.statnlp.org/software/dataset
* Multilingual Geoquery
* MalwareTextDB
* Multilingual ATIS
* NP-annotated SMS dataset
### THUOCL: Tsinghua University Open Chinese Lexicon
URL: http://thuocl.thunlp.org/
### XuetangX ("学堂在线") course corpus for Chinese word segmentation and POS tagging
URL: http://nlp.csai.tsinghua.edu.cn/site2/index.php/en/resources/195-xuetangxccorpus1-0