https://github.com/geoffbacon/cerberus
Cerberus is an app that reduces the annotation burden of linguists
allennlp linguistics natural-language-processing streamlit
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/geoffbacon/cerberus
- Owner: geoffbacon
- License: mit
- Created: 2019-10-17T22:50:20.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-11-10T00:45:08.000Z (over 5 years ago)
- Last Synced: 2024-11-04T14:43:52.502Z (over 1 year ago)
- Topics: allennlp, linguistics, natural-language-processing, streamlit
- Language: Jupyter Notebook
- Size: 581 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Cerberus
### What is Cerberus?
Cerberus is an app that reduces the annotation burden of linguists. It does this by making it easy for linguists to apply state-of-the-art natural language processing models to their data. Given some initial data, these models learn to perform linguistic annotation tasks themselves. They can then automatically perform those tasks on a much larger dataset, reducing the manual labour of a linguist. The models are not perfect and are designed to help bootstrap a linguistic project.
Cerberus currently supports the following tasks:
- **POS tagging**: Assigning a syntactic category (part of speech) to each word.
- **Translation**: Automatically translating from one language to another.
- **Classification**: Assigning a user-defined label to a word, sentence or paragraph.
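
To make the bootstrapping idea concrete, here is a minimal sketch of the workflow for POS tagging: learn from a small hand-annotated seed set, then tag unannotated sentences automatically. It is not Cerberus's code or model. It swaps in a simple most-frequent-tag baseline in plain Python instead of an AllenNLP model, and the function names (`train_baseline_tagger`, `tag_sentence`), the tag labels and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def train_baseline_tagger(annotated_sentences):
    """Learn the most frequent tag per word from (word, tag) pairs.

    Hypothetical stand-in for a real model: a most-frequent-tag baseline.
    """
    tag_counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in annotated_sentences:
        for word, tag in sentence:
            tag_counts[word.lower()][tag] += 1
            all_tags[tag] += 1
    # For each known word, keep only its most frequent tag.
    lexicon = {
        word: counts.most_common(1)[0][0]
        for word, counts in tag_counts.items()
    }
    # Unseen words fall back to the most frequent tag overall.
    fallback = all_tags.most_common(1)[0][0]
    return lexicon, fallback


def tag_sentence(words, lexicon, fallback):
    """Tag each word with its learned tag, or the fallback if unseen."""
    return [(word, lexicon.get(word.lower(), fallback)) for word in words]


# Small hand-annotated seed set (toy data, invented for illustration).
seed = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("dog", "NOUN"), ("chases", "VERB"), ("birds", "NOUN")],
]
lexicon, fallback = train_baseline_tagger(seed)

# Larger unannotated dataset that the linguist would otherwise tag by hand.
unannotated = [["the", "cat", "sleeps"], ["a", "bird", "barks"]]
predictions = [tag_sentence(s, lexicon, fallback) for s in unannotated]

for sentence in predictions:
    print(sentence)
```

The automatic tags are a starting point for the linguist to review and correct. A word the tagger never saw, such as "bird" here, simply receives the fallback tag, which shows why the output still needs checking.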
Coming soon:
- **Spelling correction**: Correcting misspelt words.
- **Morphological analysis**: Assigning morphosyntactic features to each word.
- **Language modeling**: Generating grammatical sentences.
Cerberus is built on top of [AllenNLP](https://allennlp.org/) and [Streamlit](https://streamlit.io/).
