https://github.com/centre-for-humanities-computing/odycy
A general-purpose NLP pipeline for Ancient Greek
- Host: GitHub
- URL: https://github.com/centre-for-humanities-computing/odycy
- Owner: centre-for-humanities-computing
- License: mit
- Created: 2023-01-09T10:15:23.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-26T14:29:10.000Z (8 months ago)
- Last Synced: 2024-07-30T20:18:18.706Z (4 months ago)
- Topics: ancient-greek, machine-learning, natural-language-processing, nlp, python, spacy
- Language: Python
- Homepage: https://centre-for-humanities-computing.github.io/odyCy/
- Size: 43.6 MB
- Stars: 17
- Watchers: 1
- Forks: 2
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
A general-purpose NLP pipeline for Ancient Greek.
## Features :mount_fuji:
- [x] Part of speech tagging
- [x] Lemmatization
- [x] Dependency parsing
- [x] Morphological analysis
- [ ] Named entity recognition (work in progress :construction:)

## Installation :sunrise:
odyCy models can be installed directly from Hugging Face:
```bash
# To install the transformer-based pipeline
pip install https://huggingface.co/chcaa/grc_odycy_joint_trf/resolve/main/grc_odycy_joint_trf-any-py3-none-any.whl
# To install the tok2vec-based small pipeline
pip install https://huggingface.co/chcaa/grc_odycy_joint_sm/resolve/main/grc_odycy_joint_sm-any-py3-none-any.whl
```

## Usage :whale:
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/centre-for-humanities-computing/odyCy/blob/main/tutorials/01_odycy_getting_started.ipynb#&offline=true&sandboxMode=true)
odyCy pipelines can be loaded with spaCy.
```python
import spacy

# For the transformer-based pipeline
nlp = spacy.load("grc_odycy_joint_trf")

# For a faster and smaller (but less accurate) tok2vec-based pipeline
nlp = spacy.load("grc_odycy_joint_sm")
```

Pipelines can then be used like any other spaCy pipeline ([spaCy Documentation](https://spacy.io/usage)).

Check out our documentation on [Basic Usage](https://centre-for-humanities-computing.github.io/odyCy/getting_started.html).
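As a rough illustration of what using the pipeline "like any other spaCy pipeline" can look like, the sketch below reads the annotations listed under Features from standard spaCy token attributes (`lemma_`, `pos_`, `dep_`, `morph`). The helper names `annotation_rows` and `analyze` are hypothetical and not part of odyCy, and the sketch assumes the `grc_odycy_joint_sm` pipeline has already been installed.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, str, str, str, str]


def annotation_rows(doc: Iterable) -> List[Row]:
    """Collect (text, lemma, POS tag, dependency label, morphology) per token.

    Works on any iterable of objects exposing the standard spaCy token
    attributes, e.g. a spaCy Doc. (Hypothetical helper, not part of odyCy.)
    """
    return [
        (token.text, token.lemma_, token.pos_, token.dep_, str(token.morph))
        for token in doc
    ]


def analyze(text: str, model: str = "grc_odycy_joint_sm") -> List[Row]:
    """Load a pipeline by name and return the annotation rows for `text`.

    Assumes the named pipeline is installed. (Hypothetical helper.)
    """
    # Imported lazily so annotation_rows stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(model)
    return annotation_rows(nlp(text))
```

With the small pipeline installed, `analyze("μῆνιν ἄειδε θεὰ")` should return one tuple per token. Loading the pipeline is the slow step, so when processing many texts it is usually better to call `spacy.load` once and reuse the resulting `nlp` object.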
## Performance :boat:
odyCy achieves state-of-the-art performance on multiple tasks on unseen test data from the Universal Dependencies Perseus treebank,
and performs second best on the PROIEL treebank's test set on even more tasks.
In addition, performance seems relatively stable across the two evaluation datasets compared with other NLP pipelines.

For plots and tables on odyCy's performance, check out the documentation page on [Performance](https://centre-for-humanities-computing.github.io/odyCy/performance.html).