https://github.com/rutopio/reduplication-dictionary-mandarin-minnan-hakka-jyut
Reduplications of Chinese characters (Mandarin, Minnan, Hakka, Jyut). 收錄漢語、閩南語、客語、粵語等使用漢字的語言之疊字詞
chinese chinese-characters chinese-dictionary hakka jyutcitzi-characters linguistics minnan nlp wordbook
- Host: GitHub
- URL: https://github.com/rutopio/reduplication-dictionary-mandarin-minnan-hakka-jyut
- Owner: rutopio
- License: mit
- Created: 2023-01-11T10:18:15.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-02-21T10:55:36.000Z (almost 3 years ago)
- Last Synced: 2025-01-31T16:54:31.005Z (10 months ago)
- Topics: chinese, chinese-characters, chinese-dictionary, hakka, jyutcitzi-characters, linguistics, minnan, nlp, wordbook
- Language: Jupyter Notebook
- Homepage:
- Size: 4.41 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Reduplication Dictionary: Mandarin, Minnan, Hakka, Jyut
# 漢語、閩南語、客語、粵語:漢字疊字 / 疊詞表
# 汉语、闽南语、客语、粤语:汉字叠字 / 叠词表
This repo collects reduplicated words from languages written with Chinese characters (a.k.a. Hanzi), such as Mandarin, Minnan (a.k.a. Southern Min), Hakka and Jyut (a.k.a. Yue / Cantonese). The data can be used for linguistic analysis and NLP research.
本字典收集使用漢字的漢語、閩南語、客語、粵語(廣東話)的疊字詞,以供語言學分析及自然語言處理(NLP)研究。
Reduplication is a distinctive word-formation pattern in the East Asian cultural sphere (a.k.a. the Chinese character sphere). It appears in chengyu (set phrases) and as an adjective modifier. For instance, `紅 (âng)` in Minnan means *red*, `紅紅 (âng-âng)` means *so red*, and `紅紅紅 (âng-âng-âng)` means *very red*.
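## 定義 Type Definitions
- `AA`: the repeated characters are adjacent, for example 「**天天**」, 「**打打**球」, 「水**汪汪**」 and 「**楚楚**可憐」.
- `ABA`: the repeated characters are separated by one character, for example 「**白**雲**白**」, 「**不**明**不**白」 and 「以**牙**還**牙**」.
- `ABCA`: the repeated characters are separated by two characters, for example 「**為**所欲**為**」 and 「**亞**美尼**亞**」.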
- and so on...
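
The type definitions above can be sketched in a few lines of Python. This is only an illustration, not the code used to build this dictionary: the function names `gap_label` and `find_reduplications` and the `max_gap` parameter are hypothetical. It also assumes that the characters between a repeated pair must differ from the repeated character, so 「紅紅紅」 counts as two `AA` pairs rather than an `ABA`.

```python
def gap_label(gap: int) -> str:
    """Build a type label for a gap size: 0 -> 'AA', 1 -> 'ABA', 2 -> 'ABCA'."""
    fillers = "BCDEFGHIJKLMNOPQRSTUVWXYZ"
    if gap < 0 or gap > len(fillers):
        raise ValueError("gap must be between 0 and %d" % len(fillers))
    return "A" + fillers[:gap] + "A"


def find_reduplications(text: str, max_gap: int = 2) -> list[tuple[str, str, int]]:
    """Return (type label, repeated character, start index) for each repeated pair.

    Assumption (not from the README): the characters between the pair must
    differ from the repeated character, so only the nearest repeat is reported.
    """
    hits = []
    for gap in range(max_gap + 1):
        # The second character of the pair sits gap + 1 positions after the first.
        for i in range(len(text) - gap - 1):
            j = i + gap + 1
            if text[i] == text[j] and text[i] not in text[i + 1:j]:
                hits.append((gap_label(gap), text[i], i))
    return hits


if __name__ == "__main__":
    for word in ["天天", "水汪汪", "以牙還牙", "為所欲為"]:
        print(word, find_reduplications(word))
```

Results are grouped by gap size, smallest first, and each index is the position of the first character of the pair. Raising `max_gap` extends the search to wider types such as `ABCDA`.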
## 資料來源 Dictionary & Wordbook Reference
Most of the raw data comes from dictionaries and IME wordbooks.
多數的原始資料來源為字典以及輸入法詞庫。
### 漢語 Mandarin
- 《常用疊字分類詞典》,賴慶雄、李炳傑著,螢火蟲出版社(2019),ISBN: 9789864521753
### 閩南語 Minnan
- Rime IME Wordbook
### 客語 Hakka
- [ldkrsi@jieba-Hakka](https://github.com/ldkrsi/jieba-Hakka)
### 粵語 Jyut
- [rime@Rime Cantonese input schema](https://github.com/rime/rime-cantonese)