https://github.com/eimg/myanmar-text-breaker
Syllable and word breaker (boundary segmentation) for Myanmar text in JavaScript
- Host: GitHub
- URL: https://github.com/eimg/myanmar-text-breaker
- Owner: eimg
- License: mit
- Created: 2018-03-24T10:46:39.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-02-06T15:43:34.000Z (about 1 year ago)
- Last Synced: 2024-07-31T20:29:56.508Z (9 months ago)
- Topics: javascript, nlp
- Language: JavaScript
- Homepage:
- Size: 342 KB
- Stars: 31
- Watchers: 4
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Myanmar - Myanmar Text Breaker - segmentation for Myanmar text in JavaScript | (Myanmar NLP)
README
# Myanmar Text Breaker
Syllable and word breaker (boundary segmentation) for Myanmar text in JavaScript.
## Usage
```Javascript
var syllable = require("./syllable-breaker");
var word = require("./word-breaker");

console.log( syllable('အင်္ဂါနေ့၏ဂြိုဟ်ကောင်သည်ခြင်္သေ့ဖြစ်သည်') );
// => [ 'အင်္ဂါ', 'နေ့', '၏', 'ဂြိုဟ်', 'ကောင်', 'သည်', 'ခြင်္သေ့', 'ဖြစ်', 'သည်' ]

console.log( word('ဘယ်အရာမဆိုအရာရာတိုင်းအဆိုးအကောင်းယှဉ်တွဲနေတယ်') );
// => [ 'ဘယ်အရာမဆို', 'အရာရာတိုင်း', 'အဆိုးအကောင်း', 'ယှဉ်တွဲ', 'နေ', 'တယ်' ]
```
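Both functions return plain arrays of strings, so the result can be post-processed with ordinary array methods. As a minimal sketch, the segments can be joined with a zero-width space (U+200B) to give a renderer line-break opportunities between them. The helper name `insertBreakHints` is hypothetical and is not part of this library; the sample array is copied from the `syllable()` output in the usage example.

```javascript
// Hypothetical helper, NOT part of myanmar-text-breaker.
// Joins an array of segments with a separator (zero-width space by default).
function insertBreakHints(segments, separator = "\u200B") {
  if (!Array.isArray(segments)) {
    throw new TypeError("segments must be an array of strings");
  }
  return segments.join(separator);
}

// Sample segments copied from the syllable() output in the usage example.
const segments = ['အင်္ဂါ', 'နေ့', '၏', 'ဂြိုဟ်', 'ကောင်', 'သည်', 'ခြင်္သေ့', 'ဖြစ်', 'သည်'];

// The visible text is unchanged; only invisible break hints are added.
const hinted = insertBreakHints(segments);
console.log(hinted);
```

In real use, the array would come from `syllable(text)` or `word(text)` rather than a hard-coded sample.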
## Credit
* The syllable breaker is a JavaScript port of [MyanmarParser-Py](https://github.com/thantthet/MyanmarParser-Py).
* The word breaker is based on the following data:
  * dict-words.txt -> [mydict-mmnlp-words.txt](https://github.com/trhura/pango-myanmar/tree/master/data)
  * common-words.txt -> [mydict-common-words.txt](https://github.com/trhura/pango-myanmar/tree/master/data)
  * stop-words.txt -> [stop_words.txt](https://github.com/swanhtet1992/myanmar-data)