{"id":13935761,"url":"https://github.com/natasha/yargy","last_synced_at":"2025-04-12T21:29:16.722Z","repository":{"id":38484032,"uuid":"65030656","full_name":"natasha/yargy","owner":"natasha","description":"Rule-based facts extraction for Russian language","archived":false,"fork":false,"pushed_at":"2023-07-24T10:07:03.000Z","size":671,"stargazers_count":307,"open_issues_count":16,"forks_count":41,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-04-27T23:37:08.199Z","etag":null,"topics":["earley-parser","information-extraction","morphology","nlp","python","russian","tomita","tomita-parser"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/natasha.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-08-05T15:49:24.000Z","updated_at":"2024-04-24T12:37:19.000Z","dependencies_parsed_at":"2023-02-12T21:15:54.836Z","dependency_job_id":"876030d2-a2eb-44f5-8bca-a6c2a5b8d197","html_url":"https://github.com/natasha/yargy","commit_stats":{"total_commits":412,"total_committers":8,"mean_commits":51.5,"dds":0.366504854368932,"last_synced_commit":"d6c74b9da57e32a1a08bd65c4b36a35b3f9f73e3"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/natasha%2Fyargy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/natasha%2Fyargy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/natasha%2Fyargy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/natasha%2Fyargy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/natasha","download_url":"https://codeload.github.com/natasha/yargy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248634240,"owners_count":21137010,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["earley-parser","information-extraction","morphology","nlp","python","russian","tomita","tomita-parser"],"created_at":"2024-08-07T23:02:04.455Z","updated_at":"2025-04-12T21:29:16.702Z","avatar_url":"https://github.com/natasha.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003cimg src=\"https://github.com/natasha/natasha-logos/blob/master/yargy.svg\"\u003e\n\n![CI](https://github.com/natasha/yargy/actions/workflows/test.yml/badge.svg)\n\nYargy uses rules and dictionaries to extract structured information from Russian texts. Yargy is similar to \u003ca href=\"https://yandex.ru/dev/tomita\"\u003eTomita parser\u003c/a\u003e.\n\n## Install\n\nYargy supports Python 3.7+, PyPy 3, depends only on \u003ca href=\"http://github.com/pymorphy2/pymorphy2\"\u003ePymorphy2\u003c/a\u003e.\n\n```bash\n$ pip install yargy\n```\n\n## Usage\n\n```python\nfrom yargy import Parser, rule, and_, not_\nfrom yargy.interpretation import fact\nfrom yargy.predicates import gram\nfrom yargy.relations import gnc_relation\nfrom yargy.pipelines import morph_pipeline\n\n\nName = fact(\n    'Name',\n    ['first', 'last'],\n)\nPerson = fact(\n    'Person',\n    ['position', 'name']\n)\n\nLAST = and_(\n    gram('Surn'),\n    not_(gram('Abbr')),\n)\nFIRST = and_(\n    gram('Name'),\n    not_(gram('Abbr')),\n)\n\nPOSITION = morph_pipeline([\n    'управляющий директор',\n    'вице-мэр'\n])\n\ngnc = gnc_relation()\nNAME = rule(\n    FIRST.interpretation(\n        Name.first\n    ).match(gnc),\n    LAST.interpretation(\n        Name.last\n    ).match(gnc)\n).interpretation(\n    Name\n)\n\nPERSON = rule(\n    POSITION.interpretation(\n        Person.position\n    ).match(gnc),\n    NAME.interpretation(\n        Person.name\n    )\n).interpretation(\n    Person\n)\n\nparser = Parser(PERSON)\n\nmatch = parser.match('управляющий директор Иван Ульянов')\nprint(match)\n\nPerson(\n    position='управляющий директор',\n    name=Name(\n        first='Иван',\n        last='Ульянов'\n    )\n)\n\n```\n\n## Documentation\n\nAll materials are in Russian:\n\n* \u003ca href=\"https://habr.com/ru/post/349864/\"\u003eOverview\u003c/a\u003e\n* \u003ca href=\"https://www.youtube.com/watch?v=NQxzx0qYgK8\"\u003eVideo from workshop\u003c/a\u003e\n* \u003ca href=\"https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/index.ipynb\"\u003eGetting started\u003c/a\u003e\n* \u003ca href=\"https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/ref.ipynb\"\u003eReference\u003c/a\u003e\n* \u003ca href=\"https://nbviewer.jupyter.org/github/natasha/yargy/blob/master/docs/cookbook.ipynb\"\u003eCookbook\u003c/a\u003e\n* \u003ca href=\"https://github.com/natasha/yargy-examples\"\u003eExamples\u003c/a\u003e\n* \u003ca href=\"https://github.com/natasha/natasha-usage#yargy\"\u003eCode snippets\u003c/a\u003e\n\n## Support\n\n- Chat — https://t.me/natural_language_processing\n- Issues — https://github.com/natasha/yargy/issues\n- Commercial support — https://lab.alexkuk.ru\n\n## Development\n\nDev env\n\n```bash\nbrew install graphviz\n\npython -m venv ~/.venvs/natasha-yargy\nsource ~/.venvs/natasha-yargy/bin/activate\n\npip install -r requirements/dev.txt\npip install -e .\n\npython -m ipykernel install --user --name natasha-yargy\n```\n\nTest + lint\n\n```bash\nmake test\n```\n\nUpdate docs\n\n```bash\nmake exec-docs\n\n# Manually check git diff docs/, commit\n```\n\nRelease\n\n```bash\n# Update setup.py version\n\ngit commit -am 'Up version'\ngit tag v0.16.0\n\ngit push\ngit push --tags\n\n# Github Action builds dist and publishes to PyPi\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnatasha%2Fyargy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnatasha%2Fyargy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnatasha%2Fyargy/lists"}