https://github.com/imvladikon/spacy-trankit
💥 Trankit models directly in spaCy💥
- Host: GitHub
- URL: https://github.com/imvladikon/spacy-trankit
- Owner: imvladikon
- License: mit
- Created: 2023-12-30T18:12:56.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-02-11T22:08:02.000Z (11 months ago)
- Last Synced: 2024-11-16T17:11:32.824Z (about 1 month ago)
- Topics: nlp, spacy, spacy-extension, spacy-nlp, spacy-pipeline, trankit
- Language: Python
- Homepage:
- Size: 28.3 KB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# spaCy + Trankit
This package wraps the [Trankit](https://github.com/nlp-uoregon/trankit) library, so you can use Trankit models in a
[spaCy](https://spacy.io) pipeline.

[//]: # ([![tests](https://github.com/imvladikon/spacy-trankit/actions/workflows/tests.yml/badge.svg)](https://github.com/imvladikon/spacy-trankit/actions/workflows/tests.yml))
[//]: # ([![PyPi](https://img.shields.io/pypi/v/spacy-trankit.svg?style=flat-square)](https://pypi.python.org/pypi/spacy-trankit))
[![GitHub](https://img.shields.io/github/release/imvladikon/spacy-trankit/all.svg?style=flat-square)](https://github.com/imvladikon/spacy-trankit)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/ambv/black)

Using this wrapper, you'll be able to use the following annotations, computed by
your pretrained `trankit` pipeline/model:

- Statistical tokenization (reflected in the `Doc` and its tokens)
- Lemmatization (`token.lemma` and `token.lemma_`)
- Part-of-speech tagging (`token.tag`, `token.tag_`, `token.pos`, `token.pos_`)
- Morphological analysis (`token.morph`)
- Dependency parsing (`token.dep`, `token.dep_`, `token.head`)
- Named entity recognition (`doc.ents`, `token.ent_type`, `token.ent_type_`,
`token.ent_iob`, `token.ent_iob_`)
- Sentence segmentation (`doc.sents`)

## ⌛️ Installation

As of v0.1.0, `spacy-trankit` is only compatible with **spaCy v3.x**. To install
the most recent version:

```bash
pip install git+https://github.com/imvladikon/spacy-trankit
```

Or from PyPI:

```bash
pip install spacy-trankit
```

## 📖 Usage & Examples

Load a pretrained `trankit` model into a spaCy pipeline:

```python
import spacy_trankit

# Initialize the pipeline
nlp = spacy_trankit.load("en")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)
```

Load the model from a local path:

```python
import spacy_trankit

# Initialize the pipeline
nlp = spacy_trankit.load_from_path(name="en", path="./cache")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.dep_, token.ent_type_)
print(doc.ents)
```
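
The examples above print only a few token attributes. As a rough sketch, the remaining annotations listed earlier (morphology, heads, sentences, entities) could be collected into plain Python structures as shown below. The helper names `token_rows`, `sentence_texts`, `entity_pairs`, and `demo` are hypothetical and not part of `spacy-trankit`; they rely only on the attributes named in this README plus the standard spaCy `Span.text` and `Span.label_` attributes, and assume the returned `Doc` has those annotations set.

```python
from typing import Any, Dict, Iterable, List, Tuple


def token_rows(doc: Iterable[Any]) -> List[Dict[str, str]]:
    """Collect per-token annotations into plain dictionaries (hypothetical helper)."""
    rows = []
    for token in doc:
        rows.append(
            {
                "text": token.text,
                "lemma": token.lemma_,
                "pos": token.pos_,
                "tag": token.tag_,
                "morph": str(token.morph),  # e.g. a feature string such as "Number=Sing"
                "dep": token.dep_,
                "head": token.head.text,  # text of the syntactic head token
                "ent_type": token.ent_type_,
            }
        )
    return rows


def sentence_texts(doc: Any) -> List[str]:
    """Return the text of each sentence found by sentence segmentation."""
    return [sent.text for sent in doc.sents]


def entity_pairs(doc: Any) -> List[Tuple[str, str]]:
    """Return (entity text, entity label) pairs from `doc.ents`."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def demo() -> None:
    """Run the helpers on a real pipeline; requires `spacy-trankit` to be installed."""
    import spacy_trankit  # imported lazily so the helpers above work without it

    nlp = spacy_trankit.load("en")
    doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
    for row in token_rows(doc):
        print(row)
    print(sentence_texts(doc))
    print(entity_pairs(doc))
```

Calling `demo()` runs the helpers against the same `"en"` pipeline used above. Keeping the helpers free of any direct `spacy_trankit` import means they should work on any spaCy `Doc` that carries these annotations.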