{"id":21692931,"url":"https://github.com/johnsmithm/nertk","last_synced_at":"2025-04-12T10:37:25.296Z","repository":{"id":49360089,"uuid":"356300465","full_name":"johnsmithm/nertk","owner":"johnsmithm","description":"Name Entity Recognition toolkit lunches Entator class - inline annotator within your Jupyter notebook in Python for  name entities in text.","archived":false,"fork":false,"pushed_at":"2021-04-20T08:35:45.000Z","size":43,"stargazers_count":7,"open_issues_count":1,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-26T05:33:08.057Z","etag":null,"topics":["annotation-tool","machine-learning","ner","nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/johnsmithm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-04-09T14:33:38.000Z","updated_at":"2024-09-30T13:20:15.000Z","dependencies_parsed_at":"2022-09-12T05:00:43.043Z","dependency_job_id":null,"html_url":"https://github.com/johnsmithm/nertk","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fnertk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fnertk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fnertk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnsmithm%2Fnertk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/johnsmithm","download_url":"https://codeload.github.com/johnsmithm/nertk/
tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248126024,"owners_count":21051857,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation-tool","machine-learning","ner","nlp"],"created_at":"2024-11-25T18:17:50.679Z","updated_at":"2025-04-12T10:37:25.254Z","avatar_url":"https://github.com/johnsmithm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nertk - Named Entity Recognition toolkit\n\nEntator class - annotate named entities in text inline within your [Jupyter notebook](https://jupyter.org/) in Python.\n\n\n## 1 - Overview\n\nIn a data science or machine learning project, you may prepare and study text containing named entities within a Jupyter notebook, then need to annotate the data to augment the training set or fix errors in your source data.\n\nSince you are already working within a Jupyter notebook, the Entator works inline, allowing you to interact with your data and annotate it quickly and easily, syncing straight back to your input data arrays or matrices.\n\nWithin Jupyter, you can easily home in on problem input data - perhaps only misclassified entities - so you can step through and adjust the token type just for those items. 
\n\nThe Entator widget is designed with a flexible API, making it quick and easy to get started exploring your dataset; where possible, it guesses how to work with your data without explicit configuration.\n\nThe project is currently in the ALPHA development phase, and I appreciate all feedback on any problems, including details on how the current code works or fails to work for the structure of your particular projects.\n\n\n## 2 - Examples\n\nYou can easily combine Entator's interactive components to suit your project. Here are some examples.\n\n### Annotate new samples\n\n\nSet up Entator to display buttons for each label and each word in the text; click a button to change the label.\n\n```python\nfrom nertk import Entator\nannotator = Entator(labels=['other','person','location'],\n                    inputs=[['John','is','going','to','Germany','tomorrow','.']])\nannotator.run()\n\n# get the annotations\nannotator.targets\n```\n\n![Screenshot of Entator widget in Jupyter](docs/nertk-entator.png)\n\n### Correct annotated samples\n\n```python\nfrom nertk import Entator\nannotator = Entator(labels=['other','person','location'],\n                    inputs=[['John','is','going','to','Germany','tomorrow','.']],\n                    targets=[['person', 'other', 'other', 'other', 'location', 'other', 'other']])\nannotator.run()\n\n# get the corrected annotations\nannotator.targets\n```\n\n## 3 - Installation\n\n### Install from PyPI (recommended)\n\n```\npip install nertk\n```\n\n## 4 - Contact for Feedback\n\nPlease get in touch with any feedback or questions: [LinkedIn](https://www.linkedin.com/in/ionmosnoi/). It will be especially useful to understand the structure of your project and what is needed for your data annotation - e.g. extra entity types. 
There are many ideas on the roadmap, and your input is vital for prioritising these.\n\n## 5 - License\n\nThis code is released under an MIT license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnsmithm%2Fnertk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohnsmithm%2Fnertk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnsmithm%2Fnertk/lists"}