https://github.com/machow/cantocut
- Host: GitHub
- URL: https://github.com/machow/cantocut
- Owner: machow
- Created: 2018-04-10T22:53:29.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-11-19T20:35:48.000Z (about 6 years ago)
- Last Synced: 2024-11-16T09:24:24.354Z (about 2 months ago)
- Language: Python
- Size: 573 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
cantocut
========

Cantonese sentence segmentation using the A-Star algorithm.
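
To illustrate the general idea of segmenting with A-Star, here is a minimal sketch. It is not cantocut's implementation: the function name `astar_segment`, the toy vocabulary, and the cost model (1 per vocabulary word, `unknown_cost` for a single character missing from the vocabulary) are all assumptions chosen for the example. The heuristic is the remaining length divided by the longest word length, rounded up, which never overestimates the remaining cost.

```python
import heapq
import math


def astar_segment(sentence, vocab, unknown_cost=2.0):
    """Split `sentence` into words with A* search (hypothetical helper).

    Assumed cost model: each vocabulary word costs 1; a single character
    that is not in `vocab` costs `unknown_cost` (must be >= 1).
    """
    n = len(sentence)
    max_len = max((len(w) for w in vocab), default=1)

    def heuristic(pos):
        # Every step costs at least 1 and covers at most max_len characters,
        # so this is a lower bound on the remaining cost (admissible).
        return math.ceil((n - pos) / max_len)

    # Heap entries: (estimated total cost, cost so far, position, words so far)
    frontier = [(heuristic(0), 0.0, 0, [])]
    best_cost = {0: 0.0}

    while frontier:
        _, cost, pos, words = heapq.heappop(frontier)
        if pos == n:
            return words  # first goal popped is a cheapest segmentation
        if cost > best_cost.get(pos, math.inf):
            continue  # stale entry: a cheaper path to `pos` was found later
        for end in range(pos + 1, min(n, pos + max_len) + 1):
            piece = sentence[pos:end]
            if piece in vocab:
                step = 1.0
            elif end == pos + 1:
                step = unknown_cost  # fall back to an unknown single character
            else:
                continue
            new_cost = cost + step
            if new_cost < best_cost.get(end, math.inf):
                best_cost[end] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(end), new_cost, end, words + [piece]),
                )
    return []


# Toy vocabulary, made up for this example only.
toy_vocab = {'我', '鍾意', '食', '叉烧', '包', '叉烧包'}
print(astar_segment('我鍾意食叉烧包', toy_vocab))
```

With this cost model the search prefers fewer, longer words, so `叉烧包` is kept whole rather than split into `叉烧` and `包`. A real segmenter would typically derive costs from word frequencies in its vocabulary files instead of a flat cost per word.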
Example
-------

```
from cantocut import searcher

searcher('我鍾意食叉烧包')
```

References
----------

The vocabulary files were pulled from the speech analysis toolkit [SPPAS](http://www.sppas.org/index.html).
For details on their creation and use in that project, see the following publication.

```
Roxana Fung, Brigitte Bigi* (2015). Automatic Word Segmentation for Spoken Cantonese, The Oriental Chapter of COCOSDA (International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques).
```