Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-azeri-nlp
Azerbaijani language processing software, models and datasets.
https://github.com/alexeyev/awesome-azeri-nlp
Datasets
- University of Leipzig corpus collection
- Helsinki University corpus
- Latest **azwiki** dump (latest-pages-articles.xml.bz2)
- Azeri at An Crúbadán
- azWaC: Azerbaijani corpus from the web - hosted corpus crawled from the web in 2012, ~94 million words
- Downloadable corpus
- UD project comments
- Mammad Hajili's 160K customer reviews with scores and upvotes
- **az-corpus-nlp**
- AZ summarization
- AZ-EN parallel corpus
- N. Gasimli's MS thesis
- Azerbaijani Named Entity Recognition (NER) Dataset
- UD_Azerbaijani-TueCL
Pretrained models
- Polyglot morfessor
- fastText - fastText vectors provided by the authors
Methods/Software
- Azmorph - in ALPHA state, but it was [used for web corpora preparation](https://www.sketchengine.eu/wp-content/uploads/Large_Corpora_for_turkic_2012.pdf)
- Wiktionary word forms extraction
- POS-tagging - Part-of-Speech Tagging for Azerbaijani Language. In 2018 IEEE 12th International Conference on Application of Information and Communication Technologies (AICT) (pp. 1-6). IEEE. [**Probable implementation: [aznlp repo](https://github.com/aznlp/azerbaijani-language-pos-tagger)**]
- Stemming paper, 2019
Online Demos
- Cyrillic ⇄ Latin conversion - online tool
Miscellaneous
- Turkic languages-related resources
- Azerbaijani corpora data review
- Dilmanc - funded Azerbaijani language-related initiative
- Dilmanc EAMT paper
- Apertium page - related resources
- AZNLP GitHub - related software: stemmer, POS-tagger
- MozillaAZ community spellchecker