https://github.com/xnuinside/russian-language-nlp


# russian-language-nlp
This repo holds two collections: the first links to language models, corpora, and tools for processing the Russian language; the second gathers courses, articles, and tutorials about NLP that are written in Russian.

In this repo I try to consolidate information about articles, packages, and research related to NLP for the Russian language.
Because many of the pages I refer to have docs only in Russian or are written in Russian, I mark the language in [brackets] next to each source.

If you want to add something to the list, just open a PR.

EN/RU - the language of the documentation

### Tools

[EN] https://github.com/natasha/natasha - Natasha solves basic NLP tasks for the Russian language: tokenization, sentence segmentation, word embedding, morphology tagging, lemmatization, phrase normalization, syntax parsing, NER tagging, and fact extraction.

[EN] https://github.com/bureaucratic-labs/dostoevsky - Sentiment analysis library for the Russian language
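
To give a feel for the Natasha tasks listed above, here is a minimal sketch of a pipeline that segments text, tags morphology, lemmatizes tokens, and extracts named entities. It assumes `natasha` is installed (`pip install natasha`) and follows the usage shown in the project's README. The function name `analyze` and its return shape are my own choices for illustration, not part of the library.

```python
def analyze(text):
    """Run a basic Natasha pipeline and return (tokens, entities)."""
    # Imported inside the function so this sketch can be loaded even
    # where natasha is not installed yet.
    from natasha import (
        Doc,
        MorphVocab,
        NewsEmbedding,
        NewsMorphTagger,
        NewsNERTagger,
        Segmenter,
    )

    # Pipeline components; the News* models share one embedding.
    segmenter = Segmenter()
    morph_vocab = MorphVocab()
    emb = NewsEmbedding()
    morph_tagger = NewsMorphTagger(emb)
    ner_tagger = NewsNERTagger(emb)

    doc = Doc(text)
    doc.segment(segmenter)       # tokenization + sentence segmentation
    doc.tag_morph(morph_tagger)  # morphology tagging (needed for lemmas)
    doc.tag_ner(ner_tagger)      # NER tagging

    for token in doc.tokens:
        token.lemmatize(morph_vocab)  # fills token.lemma
    for span in doc.spans:
        span.normalize(morph_vocab)   # fills span.normal

    # Hypothetical return shape: plain tuples that are easy to inspect.
    tokens = [(t.text, t.pos, t.lemma) for t in doc.tokens]
    entities = [(s.text, s.type, s.normal) for s in doc.spans]
    return tokens, entities
```

Call `analyze(...)` with any Russian sentence to get the token and entity tuples. Building the components on every call is slow, so in real code you would probably create them once and reuse them.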

### Models

[EN] https://rusvectores.org/en/models/

[EN] https://github.com/buriy/spacy-ru/releases/tag/v2.3_beta - Language models for spaCy. Note: use the version branches and releases rather than the master branch. The README.md on master is in Russian only

[EN] https://github.com/natasha/natasha-spacy - Another Russian language model for spaCy

### Datasets

[EN] https://github.com/buriy/russian-nlp-datasets/releases/tag/ - Note that the datasets are published under Releases, not in the repo's source code

### Articles about working with the Russian language in NLP
[EN] https://primer.ai/blog/Russian-Natural-Language-Processing/

[RU] https://habr.com/ru/post/516098/#natasha - About the Natasha project

### Videos
[RU] https://www.youtube.com/watch?t=951&v=-7XT_U6hVvk&feature=youtu.be&ab_channel=ODSAIRu - About Natasha

### Useful Jupyter Notebooks
https://nbviewer.jupyter.org/github/natasha/natasha/blob/master/docs.ipynb - Natasha Notebook with Samples

## Part 2
### Useful articles about NLP written in Russian

[RU] https://habr.com/ru/post/436878/ - BERT: a state-of-the-art language model for 104 languages. A tutorial on running BERT locally and on Google Colab