https://github.com/veer66/wordcutpy

A simple word breaker written in Python

wordcutpy
=========
wordcutpy is a simple Thai word breaker written in Python 3.

Installation
------------

````
pip install wordcutpy
````

Example
-------

### Conventional version

````python
# -*- coding: utf-8 -*-
from wordcut import Wordcut

if __name__ == '__main__':
    # Load the dictionary: one word per line, duplicates removed.
    with open('bigthai.txt', encoding="UTF-8") as dict_file:
        word_list = list(set(w.rstrip() for w in dict_file))
    # Build the word breaker from the custom word list and tokenize mixed text.
    wordcut = Wordcut(word_list)
    print(wordcut.tokenize("กากา cat หมา"))
````
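
The dictionary-loading step in the conventional version can be factored into a small helper. The sketch below is illustrative only: `load_word_list` is a hypothetical name and not part of wordcutpy, and the temporary file merely stands in for `bigthai.txt`. Unlike the snippet above, it also skips blank lines and sorts the result, which is an assumption about what a clean word list should look like.

```python
import os
import tempfile


def load_word_list(path):
    """Read a dictionary file (one word per line) into a sorted, de-duplicated list.

    Hypothetical helper for illustration; not part of the wordcutpy API.
    """
    with open(path, encoding="utf-8") as dict_file:
        # Strip surrounding whitespace and drop blank lines; the set removes duplicates.
        words = {line.strip() for line in dict_file if line.strip()}
    return sorted(words)


# Demonstrate with a tiny temporary dictionary (a stand-in for bigthai.txt).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("กา\nหมา\nกา\n\n")
    demo_path = tmp.name

word_list = load_word_list(demo_path)
os.remove(demo_path)
print(word_list)
```

The resulting list can then be passed to `Wordcut(word_list)` exactly as in the example above.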

### Simplified version

````python
# -*- coding: utf-8 -*-
from wordcut import Wordcut
wordcut = Wordcut.bigthai()
print(wordcut.tokenize("กากา cat หมา"))
````

Test
----

### Run tests

````shell
python -m unittest discover -s tests
````