https://github.com/machow/cantocut



cantocut
========

Cantonese word segmentation: splits a sentence into words using the A* (A-star) search algorithm.

Example
-------

```python
from cantocut import searcher

searcher('我鍾意食叉烧包')
```
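
For intuition about how A* search can drive segmentation, here is a minimal, self-contained sketch. It is not cantocut's actual implementation: the function name `segment`, the toy vocabulary, and the cost scheme (1 per known word, 2 per unknown single character) are all assumptions made for illustration.

```python
import heapq
from math import ceil


def segment(text, vocab, unknown_cost=2.0):
    """Split text into words with A* search over character positions.

    Hypothetical cost model: each vocabulary word costs 1, each
    out-of-vocabulary single character costs unknown_cost (must be >= 1).
    """
    n = len(text)
    max_len = max((len(w) for w in vocab), default=1)

    def heuristic(pos):
        # Every step costs at least 1 and covers at most max_len characters,
        # so this never overestimates the remaining cost (admissible).
        return ceil((n - pos) / max_len)

    # Heap entries: (estimated total cost, cost so far, position, words so far)
    frontier = [(heuristic(0), 0.0, 0, [])]
    best = {0: 0.0}  # cheapest known cost to reach each position

    while frontier:
        _, cost, pos, words = heapq.heappop(frontier)
        if pos == n:
            return words  # first goal popped is a cheapest segmentation
        if cost > best.get(pos, float("inf")):
            continue  # stale entry; a cheaper path to pos was found later

        for end in range(pos + 1, min(n, pos + max_len) + 1):
            piece = text[pos:end]
            if piece in vocab:
                step = 1.0
            elif end == pos + 1:
                step = unknown_cost  # fall back to a single unknown character
            else:
                continue
            new_cost = cost + step
            if new_cost < best.get(end, float("inf")):
                best[end] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(end), new_cost, end, words + [piece]),
                )
    return []


# Toy vocabulary, for illustration only (not the SPPAS vocabulary files).
toy_vocab = {"我", "鍾意", "食", "叉烧包", "叉", "包"}
print(segment("我鍾意食叉烧包", toy_vocab))
```

With this toy vocabulary the cheapest path uses four known words, so the sketch prefers `叉烧包` as one word over splitting it into single characters. A real segmenter would typically derive step costs from word frequencies rather than a flat cost.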

References
-----------

The vocabulary files were pulled from the speech analysis toolkit [SPPAS](http://www.sppas.org/index.html).
For details on their creation and use in that project, see the following publication:

```
Roxana Fung, Brigitte Bigi* (2015). Automatic Word Segmentation for Spoken Cantonese, The Oriental Chapter of COCOSDA (International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques).
```