https://github.com/xnuinside/russian-language-nlp
A repo with two collections: first, links to language models, corpora, and tools for processing the Russian language; second, courses, articles, and tutorials about NLP written in Russian.
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/xnuinside/russian-language-nlp
- Owner: xnuinside
- Created: 2021-01-07T10:16:23.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2021-02-02T10:32:53.000Z (over 5 years ago)
- Last Synced: 2025-04-06T16:18:13.111Z (about 1 year ago)
- Size: 4.88 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# russian-language-nlp
A repo with two collections: first, links to language models, corpora, and tools for processing the Russian language; second, courses, articles, and tutorials about NLP written in Russian.
In this repo I try to consolidate information about articles, packages, and research related to NLP for the Russian language.
Because many of the pages I refer to have documentation only in Russian or are written in Russian, I mark the language in [brackets] next to each source.
If you want to add something to the list, just open a PR.
[EN]/[RU] - language of the documentation
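The bracket convention above is regular enough to process mechanically, for example to list only the English-documented resources. The following is a minimal sketch, assuming each entry has the shape `[LANG] URL - note` with the tag and the note both optional; the names `Entry`, `ENTRY_RE`, and `parse_entry` are hypothetical and not part of this repo.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    lang: Optional[str]  # "EN", "RU", or None when the entry has no tag
    url: str
    note: str            # free-text description, empty if absent


# Assumed entry shape: optional "[EN]"/"[RU]" tag, a URL, optional " - note".
ENTRY_RE = re.compile(
    r"^(?:\[(?P<lang>EN|RU)\]\s+)?"   # optional language tag
    r"(?P<url>https?://\S+)"          # the link itself
    r"(?:\s+-\s+(?P<note>.*))?$"      # optional description after " - "
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one list entry; return None for headings and other lines."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return Entry(match.group("lang"), match.group("url"), match.group("note") or "")


if __name__ == "__main__":
    sample = "[EN] https://github.com/natasha/natasha-spacy - Also SpaCy Ru language model"
    print(parse_entry(sample))
```

Untagged entries come back with `lang=None`, so a caller can decide whether to treat them as unknown or to skip them.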
### Tools
[EN] https://github.com/natasha/natasha - Natasha. Solves basic NLP tasks for the Russian language: tokenization, sentence segmentation, word embedding, morphology tagging, lemmatization, phrase normalization, syntax parsing, NER tagging, fact extraction.
[EN] https://github.com/bureaucratic-labs/dostoevsky - Dostoevsky. Sentiment analysis library for the Russian language.
### Models
[EN] https://rusvectores.org/en/models/ - RusVectores. Pre-trained word embedding models for Russian.
[EN] https://github.com/buriy/spacy-ru/releases/tag/v2.3_beta - Language models for spaCy. Note: use the version branches and releases, not the master branch. The README.md on master is in Russian only.
[EN] https://github.com/natasha/natasha-spacy - Another spaCy language model for Russian.
### Datasets
[EN] https://github.com/buriy/russian-nlp-datasets/releases/tag/ - Note that the datasets are published under Releases, not in the repo's source code.
### Articles about working with the Russian language in NLP
[EN] https://primer.ai/blog/Russian-Natural-Language-Processing/
[RU] https://habr.com/ru/post/516098/#natasha - About the Natasha project
### Videos
[RU] https://www.youtube.com/watch?t=951&v=-7XT_U6hVvk&feature=youtu.be&ab_channel=ODSAIRu - About Natasha
### Useful Jupyter Notebooks
https://nbviewer.jupyter.org/github/natasha/natasha/blob/master/docs.ipynb - Natasha notebook with samples
## Part 2
### Useful articles about NLP written in Russian
[RU] https://habr.com/ru/post/436878/ - BERT: a state-of-the-art language model for 104 languages. A tutorial on running BERT locally and on Google Colab.