{"id":14958671,"url":"https://github.com/hironsan/anago","last_synced_at":"2025-05-15T04:06:02.444Z","repository":{"id":22263542,"uuid":"95491035","full_name":"Hironsan/anago","owner":"Hironsan","description":"Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech Tagging and so on.","archived":false,"fork":false,"pushed_at":"2022-12-07T23:44:31.000Z","size":7689,"stargazers_count":1482,"open_issues_count":46,"forks_count":366,"subscribers_count":59,"default_branch":"master","last_synced_at":"2025-05-15T04:05:52.687Z","etag":null,"topics":["deep-learning","keras","machine-learning","named-entity-recognition","natural-language-processing","sequence-labeling"],"latest_commit_sha":null,"homepage":"https://anago.herokuapp.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Hironsan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-06-26T21:28:36.000Z","updated_at":"2025-05-14T07:45:14.000Z","dependencies_parsed_at":"2023-01-14T08:00:34.527Z","dependency_job_id":null,"html_url":"https://github.com/Hironsan/anago","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fanago","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fanago/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fanago/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hironsan%2Fanago/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Hironsan","download_
url":"https://codeload.github.com/Hironsan/anago/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254270646,"owners_count":22042859,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","keras","machine-learning","named-entity-recognition","natural-language-processing","sequence-labeling"],"created_at":"2024-09-24T13:17:47.869Z","updated_at":"2025-05-15T04:05:57.424Z","avatar_url":"https://github.com/Hironsan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# anaGo\n\n[![Codacy Badge](https://api.codacy.com/project/badge/Grade/d746380077844b50b1cb95db1b631d18)](https://app.codacy.com/app/Hironsan/anago?utm_source=github.com\u0026utm_medium=referral\u0026utm_content=Hironsan/anago\u0026utm_campaign=Badge_Grade_Dashboard)\n\n**anaGo** is a Python library for sequence labeling (NER, POS tagging, ...), implemented in Keras.\n\nanaGo can solve sequence labeling tasks such as named entity recognition (NER), part-of-speech tagging (POS tagging), semantic role labeling (SRL) and so on. Unlike traditional sequence labeling solvers, anaGo does not require you to define any language-dependent features. 
Thus, anaGo can easily be used for any language.\n\nAs an example of anaGo, the following image shows named entity recognition in English:\n\n[anaGo Demo](https://anago.herokuapp.com/)\n\n![English NER](./docs/images/anago.gif)\n\n\u003c!--\n![English NER](https://github.com/Hironsan/anago/blob/docs/docs/images/example.en2.png?raw=true)\n\n![Japanese NER](https://github.com/Hironsan/anago/blob/docs/docs/images/example.ja2.png?raw=true)\n--\u003e\n\n## Get Started\n\nIn anaGo, the simplest type of model is the `Sequence` model. The `Sequence` model includes essential methods like `fit`, `score`, `analyze` and `save`/`load`. For more complex features, use anaGo modules such as `models`, `preprocessing` and so on.\n\nHere is the data loader:\n\n```python\n\u003e\u003e\u003e from anago.utils import load_data_and_labels\n\n\u003e\u003e\u003e x_train, y_train = load_data_and_labels('train.txt')\n\u003e\u003e\u003e x_test, y_test = load_data_and_labels('test.txt')\n\u003e\u003e\u003e x_train[0]\n['EU', 'rejects', 'German', 'call', 'to', 'boycott', 'British', 'lamb', '.']\n\u003e\u003e\u003e y_train[0]\n['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'B-MISC', 'O', 'O']\n```\n\nYou can now train the model on your data:\n\n```python\n\u003e\u003e\u003e import anago\n\n\u003e\u003e\u003e model = anago.Sequence()\n\u003e\u003e\u003e model.fit(x_train, y_train, epochs=15)\nEpoch 1/15\n541/541 [==============================] - 166s 307ms/step - loss: 12.9774\n...\n```\n\nEvaluate your performance in one line:\n\n```python\n\u003e\u003e\u003e model.score(x_test, y_test)\n0.802  # f1-micro score\n# For better performance, use pre-trained word embeddings.\n# anaGo's best score so far is 90.94 (f1-micro).\n```\n\nOr tag new text:\n\n```python\n\u003e\u003e\u003e text = 'President Obama is speaking at the White House.'\n\u003e\u003e\u003e model.analyze(text)\n{\n    \"words\": [\n        \"President\",\n        \"Obama\",\n        \"is\",\n  
      \"speaking\",\n        \"at\",\n        \"the\",\n        \"White\",\n        \"House.\"\n    ],\n    \"entities\": [\n        {\n            \"beginOffset\": 1,\n            \"endOffset\": 2,\n            \"score\": 1,\n            \"text\": \"Obama\",\n            \"type\": \"PER\"\n        },\n        {\n            \"beginOffset\": 6,\n            \"endOffset\": 8,\n            \"score\": 1,\n            \"text\": \"White House.\",\n            \"type\": \"LOC\"\n        }\n    ]\n}\n```\n\nTo download a pre-trained model, call the `download` function:\n\n```python\n\u003e\u003e\u003e from anago.utils import download\n\n\u003e\u003e\u003e url = 'https://s3-ap-northeast-1.amazonaws.com/dev.tech-sketch.jp/chakki/public/conll2003_en.zip'\n\u003e\u003e\u003e weights, params, preprocessor = download(url)\n\u003e\u003e\u003e model = anago.Sequence.load(weights, params, preprocessor)\n\u003e\u003e\u003e model.score(x_test, y_test)\n0.909446369856927\n```\n\nIf you want to use ELMo for better performance (f1: 92.22), you can use [ELModel](https://github.com/Hironsan/anago/blob/master/anago/models.py#L125) and [ELMoTransformer](https://github.com/Hironsan/anago/blob/master/anago/preprocessing.py#L197):\n\n```python\n# Importing the ELMo components.\nfrom anago.models import ELModel\nfrom anago.preprocessing import ELMoTransformer\nfrom anago.trainer import Trainer\n\n# Transforming datasets.\np = ELMoTransformer()\np.fit(x_train, y_train)\n\n# Building a model.\nmodel = ELModel(...)\nmodel, loss = model.build()\nmodel.compile(loss=loss, optimizer='adam')\n\n# Training the model.\ntrainer = Trainer(model, preprocessor=p)\ntrainer.train(x_train, y_train, x_test, y_test)\n```\n\nFor further details, see [anago/examples/elmo_example.py](https://github.com/Hironsan/anago/blob/master/examples/elmo_example.py).\n\n## Feature Support\n\nanaGo supports the following features:\n\n* Model Training\n* Model Evaluation\n* Tagging Text\n* Custom Model Support\n* Downloading pre-trained models\n* GPU Support\n* Character feature\n* CRF Support\n* Custom Callback Support\n* :collision:(new) ELMo\n\nanaGo officially supports Python 
3.4–3.6.\n\n## Installation\n\nTo install anaGo, simply use `pip`:\n\n```bash\n$ pip install anago\n```\n\nor install from the repository:\n\n```bash\n$ git clone https://github.com/Hironsan/anago.git\n$ cd anago\n$ python setup.py install\n```\n\n## Documentation\n\n(coming soon)\n\n\u003c!--\nanaGo supports pre-trained word embeddings like [GloVe vectors](https://nlp.stanford.edu/projects/glove/).\n--\u003e\n\n## Reference\n\nThis library is based on the following papers:\n\n* Lample, Guillaume, et al. \"[Neural architectures for named entity recognition.](https://arxiv.org/abs/1603.01360)\" arXiv preprint arXiv:1603.01360 (2016).\n* Peters, Matthew E., et al. \"[Deep contextualized word representations.](https://arxiv.org/abs/1802.05365)\" arXiv preprint arXiv:1802.05365 (2018).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhironsan%2Fanago","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhironsan%2Fanago","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhironsan%2Fanago/lists"}