https://github.com/liebeck/spacy-iwnlp
German lemmatization with IWNLP as extension for spaCy
- Host: GitHub
- URL: https://github.com/liebeck/spacy-iwnlp
- Owner: Liebeck
- License: mit
- Created: 2018-03-06T12:03:06.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-07-28T18:16:41.000Z (over 1 year ago)
- Last Synced: 2024-10-14T04:03:30.054Z (3 months ago)
- Topics: nlp, spacy, spacy-extension, spacy-pipeline
- Language: Python
- Homepage:
- Size: 19.5 KB
- Stars: 23
- Watchers: 5
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Citation: CITATION.cff
# spacy-iwnlp
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/Liebeck/spacy-iwnlp/master/LICENSE.md)
[![Build Status](https://api.travis-ci.org/Liebeck/spacy-iwnlp.svg?branch=master)](https://travis-ci.org/Liebeck/spacy-iwnlp)

This package uses the [spaCy 3.0 extensions](https://spacy.io/usage/processing-pipelines#extensions) to add [IWNLP-py](https://github.com/Liebeck/iwnlp-py) as a German lemmatizer directly to your spaCy pipeline.

Please report bugs in spacy-iwnlp as issues in the [IWNLP-py](https://github.com/Liebeck/iwnlp-py) repository.
## Usage
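``` python
import spacy
# This import makes the 'iwnlp' component available to add_pipe below.
from spacy_iwnlp import spaCyIWNLP

nlp = spacy.load('de_core_news_sm')
# Point lemmatizer_path at the unzipped IWNLP dump (see Installation).
nlp.add_pipe('iwnlp', config={'lemmatizer_path': 'data/IWNLP.Lemmatizer_20181001.json'})
doc = nlp('Wir mögen Fußballspiele mit ausgedehnten Verlängerungen.')
for token in doc:
    # The IWNLP result is exposed through the custom attribute token._.iwnlp_lemmas.
    print('POS: {}\tIWNLP:{}'.format(token.pos_, token._.iwnlp_lemmas))
```

## Installation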
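The manual download and unzip in step 2 below can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names `download_dump` and `extract_dump` are hypothetical and not part of spacy-iwnlp, and it assumes the archive holds the JSON file at its top level and that the URL stays reachable.

``` python
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# URL taken from step 2 of the installation instructions.
IWNLP_URL = "https://dbs.cs.uni-duesseldorf.de/datasets/iwnlp/IWNLP.Lemmatizer_20181001.zip"


def download_dump(url: str, zip_path: Path) -> Path:
    """Download the zipped dump to zip_path unless a file is already there."""
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract_dump(zip_path: Path, target_dir: Path) -> List[Path]:
    """Unzip the archive into target_dir and return the extracted .json paths."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    return [target_dir / name for name in names if name.endswith(".json")]
```

Under these assumptions, `extract_dump(download_dump(IWNLP_URL, Path("data/iwnlp.zip")), Path("data"))` would place the dump in `data/`, the directory used for `lemmatizer_path` in the usage example; the returned paths show the exact file name to pass there.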
1. Use pip to install spacy-iwnlp
``` bash
pip install spacy-iwnlp
```
2. Download the latest processed IWNLP dump from https://dbs.cs.uni-duesseldorf.de/datasets/iwnlp/IWNLP.Lemmatizer_20181001.zip and unzip it.

## Local development
Use develop.py to extend the functionality.

To update the PIP package:
``` bash
python setup.py sdist bdist_wheel
python -m twine upload dist/PACKAGENAME-VERSION.tar.gz
```

## Citation
Please include the following BibTeX if you use IWNLP in your work:
``` bibtex
@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
author = {Liebeck, Matthias and Conrad, Stefan},
title = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
year = {2015},
publisher = {Association for Computational Linguistics},
pages = {414--418},
url = {http://www.aclweb.org/anthology/P15-2068}
}
```