{"id":23236998,"url":"https://github.com/nicolay-r/bulk-ner","last_synced_at":"2025-06-17T19:05:55.640Z","repository":{"id":215108681,"uuid":"738099722","full_name":"nicolay-r/bulk-ner","owner":"nicolay-r","description":"Tiny no-string framework for a quick third-party models binding for entities extraction from cells of long tabular data","archived":false,"fork":false,"pushed_at":"2025-03-10T13:14:44.000Z","size":131,"stargazers_count":4,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-25T09:45:45.280Z","etag":null,"topics":["arekit","bert","bert-model","colab","colab-notebook","deeppavlov","ner","pipelines","spreadsheet","transformer-model","transformers"],"latest_commit_sha":null,"homepage":"https://github.com/nicolay-r/AREkit","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nicolay-r.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-02T12:24:06.000Z","updated_at":"2025-03-16T01:20:29.000Z","dependencies_parsed_at":"2024-01-06T14:27:54.482Z","dependency_job_id":"6875809e-88df-4164-9f37-45b42cbae4cd","html_url":"https://github.com/nicolay-r/bulk-ner","commit_stats":null,"previous_names":["nicolay-r/ner-service","nicolay-r/bulk-ner"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/nicolay-r/bulk-ner","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolay-r%2Fbulk-ner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolay-r%2Fbulk-ner/tags","releases_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolay-r%2Fbulk-ner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolay-r%2Fbulk-ner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nicolay-r","download_url":"https://codeload.github.com/nicolay-r/bulk-ner/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nicolay-r%2Fbulk-ner/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260424635,"owners_count":23007036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arekit","bert","bert-model","colab","colab-notebook","deeppavlov","ner","pipelines","spreadsheet","transformer-model","transformers"],"created_at":"2024-12-19T04:13:24.089Z","updated_at":"2025-06-17T19:05:50.618Z","avatar_url":"https://github.com/nicolay-r.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# bulk-ner 0.24.1 \n![](https://img.shields.io/badge/Python-3.9-brightgreen.svg)\n![](https://img.shields.io/badge/AREkit-0.25.0-orange.svg)\n[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nicolay-r/ner-service/blob/main/NER_annotation_service.ipynb)\n[![twitter](https://img.shields.io/twitter/url/https/shields.io.svg?style=social)](https://x.com/nicolayr_/status/1842300499011260827)\n[![PyPI downloads](https://img.shields.io/pypi/dm/bulk-ner.svg)](https://pypistats.org/packages/bulk-ner)\n\n\u003cp align=\"center\"\u003e\n    \u003cimg 
src=\"logo.png\"/\u003e\n\u003c/p\u003e\n\nA no-strings inference framework that serves wrapped AI models for [Named Entity Recognition (NER)](https://en.wikipedia.org/wiki/Named-entity_recognition), powered by \n[AREkit](https://github.com/nicolay-r/AREkit) and the related [text-processing pipelines](https://github.com/nicolay-r/AREkit/wiki/Pipelines:-Text-Processing).\n\nThe key benefits of this tiny framework are as follows:\n1. ☑️ Native support for batching;\n2. ☑️ Native handling of long input contexts.\n\n# Installation\n\n```bash\npip install bulk-ner==0.24.1\n```\n\n# Usage\n\nThis example uses `DeepPavlov==1.3.0` as the adapter for NER models, passed via the `--adapter` parameter:\n\n```bash\npython -m bulk_ner.annotate \\\n    --src \"test/data/test.tsv\" \\\n    --prompt \"{text}\" \\\n    --batch-size 10 \\\n    --adapter \"dynamic:models/dp_130.py:DeepPavlovNER\" \\\n    --output \"test-annotated.jsonl\" \\\n    %% \\\n    --model \"ner_ontonotes_bert_mult\"\n```\n\nYou can choose other models via the `--model` parameter.\n\nThe list of supported models is available here: \nhttps://docs.deeppavlov.ai/en/master/features/models/NER.html\n\n## Deploy your model\n\n\u003e **Quick example**: Check out the [default DeepPavlov wrapper implementation](/models/dp_130.py)\n\nAll you have to do is implement the `BaseNER` class, which has the following protected method:\n* `_forward(sequences)` -- expected to return two lists of the same length:\n    * `terms` -- the list of atomic elements of the text (usually words)\n    * `labels` -- B-I-O labels, one per term.\n  \n\n## Powered by\n\n* AREkit [[github]](https://github.com/nicolay-r/AREkit)\n\n\u003cp float=\"left\"\u003e\n\u003ca href=\"https://github.com/nicolay-r/AREkit\"\u003e\u003cimg 
src=\"https://github.com/nicolay-r/ARElight/assets/14871187/01232f7a-970f-416c-b7a4-1cda48506afe\"/\u003e\u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnicolay-r%2Fbulk-ner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnicolay-r%2Fbulk-ner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnicolay-r%2Fbulk-ner/lists"}