{"id":13936395,"url":"https://github.com/deeppavlov/ner","last_synced_at":"2025-04-06T03:10:31.198Z","repository":{"id":27841093,"uuid":"103505457","full_name":"deeppavlov/ner","owner":"deeppavlov","description":"Named Entity Recognition ","archived":false,"fork":false,"pushed_at":"2023-05-22T21:11:48.000Z","size":119,"stargazers_count":332,"open_issues_count":9,"forks_count":63,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-03-30T02:07:21.765Z","etag":null,"topics":["deep-learning","named-entity-recognition","natural-language-processing","natural-language-understanding","neural-network","nlp-machine-learning"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deeppavlov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-14T08:15:43.000Z","updated_at":"2025-03-14T15:33:19.000Z","dependencies_parsed_at":"2024-11-08T10:33:09.581Z","dependency_job_id":null,"html_url":"https://github.com/deeppavlov/ner","commit_stats":null,"previous_names":["deepmipt/ner"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fner/manifests","owner_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/owners/deeppavlov","download_url":"https://codeload.github.com/deeppavlov/ner/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427006,"owners_count":20937201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","named-entity-recognition","natural-language-processing","natural-language-understanding","neural-network","nlp-machine-learning"],"created_at":"2024-08-07T23:02:37.650Z","updated_at":"2025-04-06T03:10:31.174Z","avatar_url":"https://github.com/deeppavlov.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# This repository is outdated; please move to https://github.com/deepmipt/DeepPavlov\n\n# Neural Networks for Named Entity Recognition\n\nThis repo contains several neural network architectures for named entity recognition from the paper \"_Application of a Hybrid Bi-LSTM-CRF model to the task of Russian Named Entity Recognition_\" https://arxiv.org/pdf/1709.09686.pdf, which is inspired by the LSTM+CRF architecture from https://arxiv.org/pdf/1603.01360.pdf.\n\nThe NER class from ner/network.py provides methods for constructing, training, and running inference with neural networks for Named Entity Recognition.\n\nWe provide a pre-trained CNN model for Russian Named Entity Recognition.\nThe model was trained on three datasets:\n\n- Gareev corpus [1] (available on request from the authors)\n- FactRuEval 2016 [2]\n- NE3 (extended Persons-1000) [3, 4]\n\nThe pre-trained model recognizes the following entity types:\n\n- Persons (PER)\n- Locations (LOC)\n- 
Organizations (ORG)\n\nAn example of using the pre-trained model is provided in [example.ipynb](https://github.com/deepmipt/ner/blob/master/example.ipynb).\n\nRemark: at the training stage the corpora were lemmatized and lowercased,\nso input text must be tokenized, lemmatized, and lowercased before it is fed into the model.\n\nThe F1 scores of the presented model and other published solutions are given in the table below:\n\n| Models                | Gareev’s dataset | Persons-1000 | FactRuEval 2016 |\n|-----------------------|:----------------:|:------------:|:---------------:|\n| Gareev et al. [1]     | 75.05            |              |                 |\n| Malykh et al. [5]     | 62.49            |              |                 |\n| Trofimov [6]          |                  | 95.57        |                 |\n| Rubaylo et al. [7]    |                  |              | 78.13           |\n| Sysoev et al. [8]     |                  |              | 74.67           |\n| Ivanitsky et al. [9]  |                  |              | **87.88**       |\n| Mozharova et al. [10] |                  | 97.21        |                 |\n| Ours (Bi-LSTM+CRF)    | **87.17**        | **99.26**    | 82.10           |\n\n### Usage\n\n#### Installing\nThe toolkit is implemented in Python 3 and requires a number of packages. 
To install all the required packages, use:\n```\n$ pip3 install -r requirements.txt\n```\n\nor\n\n```\n$ pip3 install git+https://github.com/deepmipt/ner\n```\n\nWarning: the requirements file does not specify a GPU version of TensorFlow.\n\n#### Command-Line Interface\nThe simplest way to use the pre-trained Russian NER model is via the command-line interface:\n\n    $ echo \"На конспирологическом саммите в США глава Федерального Бюро Расследований сделал невероятное заявление\" | ./ner.py\n\n    На O\n    конспирологическом O\n    саммите O\n    в O\n    США B-LOC\n    глава O\n    Федерального B-ORG\n    Бюро I-ORG\n    Расследований I-ORG\n    сделал O\n    невероятное O\n    заявление O\n\nFor interactive usage, simply type:\n\n    $ ./ner.py\n\n#### Usage as a module\n\n```\n\u003e\u003e\u003e import ner\n\u003e\u003e\u003e extractor = ner.Extractor()\n\u003e\u003e\u003e for m in extractor(\"На конспирологическом саммите в США глава Федерального Бюро Расследований сделал невероятное заявление\"):\n...     print(m)\nMatch(tokens=[Token(span=(32, 35), text='США')], span=Span(start=32, end=35), type='LOC')\nMatch(tokens=[Token(span=(42, 54), text='Федерального'), Token(span=(55, 59), text='Бюро'), Token(span=(60, 73), text='Расследований')], span=Span(start=42, end=73), type='ORG')\n```\n\n### Training\n\nTo see how to train the network and what data format is required, see the [training_example.ipynb](https://github.com/deepmipt/ner/blob/master/training_example.ipynb) Jupyter notebook.\n\n### Literature\n\n[1] - Rinat Gareev, Maksim Tkachenko, Valery Solovyev, Andrey Simanovsky, Vladimir Ivanov: Introducing Baselines for Russian Named Entity Recognition. 
Computational Linguistics and Intelligent Text Processing, 329 – 342 (2013).\n\n[2] - https://github.com/dialogue-evaluation/factRuEval-2016\n\n[3] - http://ai-center.botik.ru/Airec/index.php/ru/collections/28-persons-1000\n\n[4] - http://labinform.ru/pub/named_entities/descr_ne.htm\n\n[5] - Reproducing Russian NER Baseline Quality without Additional Data. In Proceedings of the 3rd International Workshop on Concept Discovery in Unstructured Data, Moscow, Russia, 54 – 59 (2016).\n\n[6] - Rubaylo A. V., Kosenko M. Y.: Software utilities for natural language information retrieval. Almanac of modern science and education, Volume 12 (114), 87 – 92 (2016).\n\n[7] - Sysoev A. A., Andrianov I. A.: Named Entity Recognition in Russian: the Power of Wiki-Based Approach. dialog-21.ru\n\n[8] - Ivanitskiy Roman, Alexander Shipilo, Liubov Kovriguina: Russian Named Entities Recognition and Classification Using Distributed Word and Phrase Representations. In SIMBig, 150 – 156 (2016).\n\n[9] - Mozharova V., Loukachevitch N.: Two-stage approach in Russian named entity recognition. In Intelligence, Social Media and Web (ISMW FRUCT), 2016 International FRUCT Conference, 1 – 6 (2016).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeppavlov%2Fner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeeppavlov%2Fner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeppavlov%2Fner/lists"}