Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sanjibnarzary/bodo-tokenizers
Pre-tokenized models for Bodo. This repository includes all the tokenized models to be used in Neural Machine Translation. They include pre-tokenized models trained using ByteLevelBPETokenizer, BPETokenizer, SentencePieceBPETokenizer, and BertWordPieceTokenizer.
Last synced: 2 days ago
- Host: GitHub
- URL: https://github.com/sanjibnarzary/bodo-tokenizers
- Owner: sanjibnarzary
- Created: 2020-01-13T06:29:18.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-08-21T09:22:02.000Z (about 1 year ago)
- Last Synced: 2023-08-21T09:47:27.313Z (about 1 year ago)
- Topics: bodo, indian-language, less-resource-languages, natural-language-processing, nlp, nlp-bodo, tokenizer
- Homepage:
- Size: 4.55 MB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0