https://github.com/dcrebbin/cantonese-data
Various Cantonese (YUE) datasets
- Host: GitHub
- URL: https://github.com/dcrebbin/cantonese-data
- Owner: dcrebbin
- Created: 2024-03-27T14:47:14.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-24T11:54:59.000Z (almost 2 years ago)
- Last Synced: 2025-03-15T00:16:05.803Z (about 1 year ago)
- Language: C#
- Homepage:
- Size: 1.46 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Cantonese Datasets
Cantonese datasets created by https://withlangpal.com using data provided by https://words.hk, https://cantowords.com, and https://lshk.org.
### Datasets:
1. mapped-yue-by-frequency
A JSON map of Cantonese words, ordered by usage frequency. Each word maps to its standard Jyutping romanization (alternatives are available below) and an array of English translations. The frequency ordering is based on Word List (Graded, with Translations) - Full_20211207.

2. yue-jyutping-conversion
A JSON map for converting Chinese characters to Jyutping. Most characters have more than one possible Jyutping reading.
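
The two maps described above could be consumed roughly as follows. This is a minimal sketch, not the repository's own code: the object shapes, the sample entries, and the helper names `candidateReadings`, `toJyutping`, and `topWords` are all assumptions made for illustration, and the real JSON files may be structured differently.

```javascript
// ASSUMED shape for yue-jyutping-conversion:
// character -> array of candidate Jyutping readings.
// The entries here are hand-written samples, not copied from the dataset.
const jyutpingMap = {
  "你": ["nei5"],
  "好": ["hou2", "hou3"],
};

// ASSUMED shape for mapped-yue-by-frequency:
// word -> { jyutping, english }, with keys inserted most frequent first.
const frequencyMap = {
  "我": { jyutping: "ngo5", english: ["I", "me"] },
  "你好": { jyutping: "nei5 hou2", english: ["hello"] },
};

// Hypothetical helper: list every candidate reading for each character.
// Array.from splits by code point, so rare characters outside the
// Basic Multilingual Plane are not cut in half.
function candidateReadings(text, map) {
  return Array.from(text).map((char) => ({
    char,
    readings: map[char] ?? [],
  }));
}

// Hypothetical helper: naive conversion that takes the first listed reading
// and passes unknown characters through unchanged. Picking the right reading
// for a character with several candidates needs word-level context.
function toJyutping(text, map) {
  return candidateReadings(text, map)
    .map(({ char, readings }) => (readings.length > 0 ? readings[0] : char))
    .join(" ");
}

// Hypothetical helper: first n entries of the frequency-ordered map.
// Relies on JSON.parse and object literals preserving the order of
// non-numeric string keys.
function topWords(map, n) {
  return Object.entries(map).slice(0, n);
}

console.log(toJyutping("你好", jyutpingMap)); // nei5 hou2
console.log(candidateReadings("好", jyutpingMap)[0].readings); // [ 'hou2', 'hou3' ]
console.log(topWords(frequencyMap, 1)[0][0]); // 我
```

Because most characters have more than one reading, taking the first candidate is only a fallback. Looking up whole words in the frequency map first, and falling back to per-character conversion for the remainder, would likely give better results.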

Run:
1. Install Node.js
2. Run `node conversion.js`