{"id":16194695,"url":"https://github.com/hironsan/neraug","last_synced_at":"2026-03-11T21:03:57.005Z","repository":{"id":57445501,"uuid":"388022846","full_name":"Hironsan/neraug","owner":"Hironsan","description":"A text augmentation tool for named entity recognition.","archived":false,"fork":false,"pushed_at":"2021-07-22T11:39:07.000Z","size":124,"stargazers_count":52,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-21T18:03:35.524Z","etag":null,"topics":["deep-learning","machine-learning","natural-language-processing","nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Hironsan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-07-21T06:52:25.000Z","updated_at":"2025-02-12T06:39:24.000Z","dependencies_parsed_at":"2022-09-26T17:30:56.444Z","dependency_job_id":null,"html_url":"https://github.com/Hironsan/neraug","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/Hironsan/neraug","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fneraug","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fneraug/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fneraug/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fneraug/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Hironsan","download_url":"https://codeload.github.com/Hironsan/neraug/tar.gz/refs/heads/master","sbom
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fneraug/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261166126,"owners_count":23118981,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","machine-learning","natural-language-processing","nlp"],"created_at":"2024-10-10T08:24:35.566Z","updated_at":"2026-03-11T21:03:51.958Z","avatar_url":"https://github.com/Hironsan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# neraug\n\nThis Python library helps you augment text data for named entity recognition.\n\n## Augmentation Example\n\n![](./docs/images/example.png)  \nSource: [An Analysis of Simple Data Augmentation for Named Entity Recognition](https://aclanthology.org/2020.coling-main.343/)\n\n## Installation\n\nTo install the library:\n\n```bash\npip install neraug\n```\n\n## Usage\n\nFor example, to use the `DictionaryReplacement` algorithm:\n\n```python\n\u003e\u003e\u003e from neraug.augmentator import DictionaryReplacement\n\u003e\u003e\u003e from neraug.scheme import IOBES\n\n\u003e\u003e\u003e ne_dic = {'Tokyo Big Sight': 'LOC'}\n\u003e\u003e\u003e augmentator = DictionaryReplacement(ne_dic, str.split, IOBES)\n\u003e\u003e\u003e x = ['I', 'went', 'to', 'Tokyo']\n\u003e\u003e\u003e y = ['O', 'O', 'O', 'S-LOC']\n\u003e\u003e\u003e x_augs, y_augs = augmentator.augment(x, y, n=1)\n\u003e\u003e\u003e x_augs\n[['I', 'went', 'to', 'Tokyo', 'Big', 'Sight']]\n\u003e\u003e\u003e y_augs\n[['O', 'O', 'O', 'B-LOC', 'I-LOC', 
'E-LOC']]\n```\n\nThe library supports the following algorithms:\n\n- DictionaryReplacement\n- LabelWiseTokenReplacement\n- MentionReplacement\n- ShuffleWithinSegment\n\nand the following tagging schemes:\n\n- IOB2\n- IOBES\n- BILOU\n\n## Reference\n\nThanks to the authors of the following research:\n\n- [An Analysis of Simple Data Augmentation for Named Entity Recognition](https://aclanthology.org/2020.coling-main.343/)\n\n## Citation\n\n```latex\n@misc{neraug,\n  title={neraug: A data augmentation tool for named entity recognition},\n  author={Hiroki Nakayama},\n  url={https://github.com/Hironsan/neraug},\n  year={2021}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhironsan%2Fneraug","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhironsan%2Fneraug","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhironsan%2Fneraug/lists"}