# Recon



Recon NER: debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data.





---

**Documentation**: https://kabirkhan.github.io/recon

**Source Code**: https://github.com/kabirkhan/recon

---

Recon is a library to help you fix your annotated NER data and identify examples that are hardest for your model to predict so you can strategically prioritize the examples you annotate.

The key features are:

* **Data Validation and Cleanup**: Easily validate the format of your NER data, filter overlapping entity annotations, and fix missing properties.
* **Statistics**: Get statistics on your data, from how many annotations you have for each label to more complex metrics such as quality scores for the balance of your dataset.
* **Model Insights**: Analyze how well your model does on your dataset. Identify the top errors your model is making so you can prioritize data collection and correction strategically.
* **Dataset Management**: Recon provides `Dataset` and `Corpus` containers to manage the train/dev/test split of your data and apply the same functions across all splits, plus a concatenation of all examples. Operate in place to consistently transform your data, with reliable tracking and the ability to version and roll back changes.
* **Serializable Dataset**: Serialize your data from the Recon type system to JSON and deserialize it back.
* **Type Hints**: Comprehensive type system based on Python 3.7+ type hints.
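
To make the first features concrete, here is a minimal sketch in plain Python of what filtering overlapping annotations, counting labels, and applying one function across all splits can look like. It does not use Recon's API: the dict-based example format and the helper names `filter_overlaps`, `label_counts`, `apply_to_splits`, and `clean` are assumptions made purely for illustration, and keeping the longest span is just one possible overlap policy.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical example format (NOT Recon's types): each example is a dict with
# the raw "text" and a list of character-offset "spans".


def filter_overlaps(spans: List[dict]) -> List[dict]:
    """Keep the longest span when annotations overlap; ties go to the earlier span."""
    # Visit longer spans first so they win over shorter spans they overlap.
    by_length = sorted(spans, key=lambda s: (-(s["end"] - s["start"]), s["start"]))
    kept: List[dict] = []
    for span in by_length:
        # A span is kept only if it is disjoint from every span kept so far.
        if all(span["end"] <= k["start"] or span["start"] >= k["end"] for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s["start"])


def label_counts(examples: List[dict]) -> Counter:
    """Count how many annotations exist for each label."""
    return Counter(span["label"] for ex in examples for span in ex["spans"])


def clean(examples: List[dict]) -> List[dict]:
    """Return copies of the examples with overlapping spans removed."""
    return [{**ex, "spans": filter_overlaps(ex["spans"])} for ex in examples]


def apply_to_splits(corpus: Dict[str, List[dict]], fn: Callable) -> dict:
    """Apply the same function to every split (train/dev/test) of a corpus."""
    return {name: fn(examples) for name, examples in corpus.items()}


corpus = {
    "train": [
        {
            "text": "Apple Inc. hired Tim Cook.",
            "spans": [
                {"start": 0, "end": 10, "label": "ORG"},
                {"start": 0, "end": 5, "label": "ORG"},  # overlaps the span above
                {"start": 17, "end": 25, "label": "PERSON"},
            ],
        }
    ],
    "dev": [
        {
            "text": "She moved to Paris.",
            "spans": [{"start": 13, "end": 18, "label": "GPE"}],
        }
    ],
}

cleaned = apply_to_splits(corpus, clean)
stats = apply_to_splits(cleaned, label_counts)
print(stats)
```

Running the same `clean` and `label_counts` functions over every split keeps train and dev data consistent. See the documentation for how Recon itself handles this.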

## Requirements

Python 3.7+

* spaCy
* Pydantic (type system and JSON serialization)
* Typer (CLI)
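
If you want to check an environment against these requirements before installing, a small standard-library script along these lines may help. It is only a sketch: `check_environment` is a hypothetical helper, not part of Recon, and it assumes the usual import names `spacy`, `pydantic`, and `typer`.

```python
import importlib.util
import sys
from typing import Dict, Sequence

# Assumed import names for the dependencies listed above.
REQUIRED_MODULES = ("spacy", "pydantic", "typer")
MIN_PYTHON = (3, 7)


def check_environment(modules: Sequence[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Report whether the Python version and each required module are available."""
    report = {"python": sys.version_info >= MIN_PYTHON}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

Installing `reconner` with pip should normally pull in any missing dependencies, so this check is mainly useful for diagnosing an existing environment.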

## Installation

```console
$ pip install reconner
---> 100%
Successfully installed reconner
```

## License

This project is licensed under the terms of the MIT license.