https://github.com/crownpku/Small-Chinese-Corpus
Some useful Chinese corpus datasets (small Chinese corpus data)
chinese-nlp corpus
- Host: GitHub
- URL: https://github.com/crownpku/Small-Chinese-Corpus
- Owner: crownpku
- Created: 2016-08-26T06:57:40.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2020-03-29T08:43:35.000Z (about 6 years ago)
- Last Synced: 2025-07-20T07:33:00.391Z (9 months ago)
- Topics: chinese-nlp, corpus
- Homepage:
- Size: 92.4 MB
- Stars: 535
- Watchers: 31
- Forks: 165
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
Awesome Lists containing this project
- awesome-nlp-chinese-corpus - Small-Chinese-Corpus
- awesome-deeplearning-resources - Small Chinese Corpus Data: Some useful Chinese corpus datasets
- awesome-chinese-nlp - Small Chinese Corpus Data
README
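# Small Chinese Corpus Data: Some useful Chinese corpus datasets
* Latitude and longitude coordinates of Chinese provinces and cities: city_location/
* Postal codes of Chinese provinces and cities: postal_provinces/
* National administrative division and urban-rural classification codes (2015): china_geo_code/
* Collection of Chinese idioms (chengyu): chengyu/
* Chinese personal names, plus character names from Jin Yong's novels, Romance of the Three Kingdoms, and Dream of the Red Chamber: chi_names/
* Sample data for Chinese named entity recognition: NER_chi/
* Sample data for Chinese relation recognition: relation_multiple_chi/
* Sample data for Chinese reading comprehension: reading_comprehension_chi/
* Chinese visual question answering data (based on MSCOCO): Chinese_Visual_QA_pairs/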
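
The README does not describe the file formats inside each directory, so a reasonable first step after cloning is to see what each dataset folder contains. The following is a minimal sketch, assuming the repository has been cloned locally; the function name `summarize_corpus` and the clone path `Small-Chinese-Corpus` are illustrative assumptions, not part of the repository. Only the directory names come from the list above.

```python
from pathlib import Path

# Directory names as given in the README list above.
DATASET_DIRS = [
    "city_location",
    "postal_provinces",
    "china_geo_code",
    "chengyu",
    "chi_names",
    "NER_chi",
    "relation_multiple_chi",
    "reading_comprehension_chi",
    "Chinese_Visual_QA_pairs",
]


def summarize_corpus(repo_root):
    """Map each dataset directory found under repo_root to its sorted file list.

    Hypothetical helper: directories that are missing are skipped, and no
    assumption is made about the format of the files themselves.
    """
    root = Path(repo_root)
    summary = {}
    for name in DATASET_DIRS:
        folder = root / name
        if not folder.is_dir():
            continue
        # Collect files recursively, as POSIX-style paths relative to the folder.
        summary[name] = sorted(
            p.relative_to(folder).as_posix()
            for p in folder.rglob("*")
            if p.is_file()
        )
    return summary


if __name__ == "__main__":
    # Assumed local clone path; adjust to wherever the repository was cloned.
    for name, files in summarize_corpus("Small-Chinese-Corpus").items():
        print(f"{name}: {len(files)} file(s)")
        for rel_path in files[:5]:
            print("   ", rel_path)
```

Inspecting the listed files first (extensions, sizes, encodings) helps in choosing a suitable parser for each dataset, since the formats may differ between directories.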