Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with less-resource-languages
A curated list of projects in awesome lists tagged with less-resource-languages.
https://github.com/sinaahmadi/klpt
The Kurdish Language Processing Toolkit
kurdish kurdish-language-processing kurdish-oss kurdish-stemming kurdish-tokenization language-technology less-resource-languages natural-language-processing toolkit
Last synced: 14 Nov 2024
https://github.com/sinaahmadi/KurdishMT
Towards Machine Translation for the Kurdish Language
kurdish kurdish-language-processing less-resource-languages machine-translation nlp
Last synced: 14 Nov 2024
https://github.com/sinaahmadi/ZazaGoraniCorpus
A corpus for the Zazaki and Gorani languages
computational-linguistics corpus corpus-data corpus-linguistics feyli gorani kurdish kurdish-language-processing less-resource-languages natural-language-processing southern-kurdish zazaki
Last synced: 14 Nov 2024
https://github.com/sinaahmadi/KurdishLID
Language identification of Kurdish and Zaza-Gorani languages (& variants)
arabic feyli gorani hawrami kurdish kurdish-language-processing kurdish-oss kurmanji language-identification less-resource-languages persian sorani southern-kurdish turkish zazaki
Last synced: 14 Nov 2024
https://github.com/sanjibnarzary/bodo-tokenizers
Pre-tokenized models for Bodo. This repository includes all the tokenized models to be used in Neural Machine Translation. The models include pre-tokenized models trained using ByteLevelBPETokenizer, BPETokenizer, SentencePieceBPETokenizer, and BertWordPieceTokenizer.
bodo indian-language less-resource-languages natural-language-processing nlp nlp-bodo tokenizer
Last synced: 13 Nov 2024
https://github.com/sinaahmadi/ScriptNormalization
Script Normalization for Unconventional Writing of Perso-Arabic Scripts (ACL 2023)
acl2023 arabic azeri gilaki gorani kashmiri kurdish kurdish-language-processing kurmanji less-resource-languages mazanderani nlp persian preprocessing script-normalization sindhi sorani turkish urdu
Last synced: 14 Nov 2024