https://github.com/congnghia0609/ntc-vntok
ntc-vntok is a tokenizer library for the Vietnamese language
- Host: GitHub
- URL: https://github.com/congnghia0609/ntc-vntok
- Owner: congnghia0609
- License: apache-2.0
- Created: 2021-01-15T18:03:17.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-17T19:59:00.000Z (over 4 years ago)
- Last Synced: 2023-09-08T09:20:46.883Z (about 2 years ago)
- Topics: nlp, ntc-vntok, segmentation, tokenizer, vietnamese
- Language: Java
- Homepage: https://github.com/congnghia0609/ntc-vntok
- Size: 5.25 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# ntc-vntok
ntc-vntok is a tokenizer library for the Vietnamese language.

## Maven
```xml
<dependency>
    <groupId>com.streetcodevn</groupId>
    <artifactId>ntc-vntok</artifactId>
    <version>1.0.0</version>
</dependency>
```
## Quick start
```java
String s = "VNTok là công cụ tách từ Tiếng Việt.";
System.out.println(s);
VnTok vntok = new VnTok();
String rs = vntok.tokenizeSentence(s);
System.out.println(rs);
//VNTok là công_cụ tách từ Tiếng_Việt .
```

## License
This code is under the [Apache License v2](https://www.apache.org/licenses/LICENSE-2.0).
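
## Working with the output

The Quick start output joins the syllables of a multi-syllable word with underscores (`công_cụ`, `Tiếng_Việt`) and separates tokens with spaces. The sketch below shows one way to turn such a string into a list of words. It uses only the Java standard library; the class `TokenizedOutput` and the method `toWords` are hypothetical helpers for illustration and are not part of ntc-vntok.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ntc-vntok.
class TokenizedOutput {

    // Splits a tokenized sentence on whitespace, then replaces the
    // underscores inside each token with spaces.
    // Assumption: every underscore is a syllable joiner. An underscore that
    // was already present in the input text would also become a space.
    static List<String> toWords(String tokenized) {
        List<String> words = new ArrayList<>();
        for (String token : tokenized.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token.replace('_', ' '));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The tokenized string shown in the Quick start output.
        String rs = "VNTok là công_cụ tách từ Tiếng_Việt .";
        for (String word : toWords(rs)) {
            System.out.println(word);
        }
    }
}
```

For the Quick start sentence this yields seven entries, with `công cụ` and `Tiếng Việt` each kept as a single word and the final period as its own token.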