https://github.com/liebeck/spacy-iwnlp

German lemmatization with IWNLP as extension for spaCy

# spacy-iwnlp
[![license](https://img.shields.io/github/license/mashape/apistatus.svg?maxAge=2592000)](https://github.com/Liebeck/spacy-iwnlp/master/LICENSE.md)
[![Build Status](https://api.travis-ci.org/Liebeck/spacy-iwnlp.svg?branch=master)](https://travis-ci.org/Liebeck/spacy-iwnlp)

This package uses the [spaCy 3.0 extensions](https://spacy.io/usage/processing-pipelines#extensions) to add [IWNLP-py](https://github.com/Liebeck/iwnlp-py) as a German lemmatizer directly to your spaCy pipeline.

Please report bugs in spacy-iwnlp as an issue in the [IWNLP-py](https://github.com/Liebeck/iwnlp-py) repository.

## Usage
``` python
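import spacy
from spacy_iwnlp import spaCyIWNLP  # the import makes the 'iwnlp' factory available

# Load a German spaCy pipeline
nlp = spacy.load('de_core_news_sm')
# Append the IWNLP component, pointing it at the unzipped lemmatizer dump
nlp.add_pipe('iwnlp', config={'lemmatizer_path': 'data/IWNLP.Lemmatizer_20181001.json'})
doc = nlp('Wir mögen Fußballspiele mit ausgedehnten Verlängerungen.')
for token in doc:
    # iwnlp_lemmas is the custom token attribute filled in by the component
    print('POS: {}\tIWNLP:{}'.format(token.pos_, token._.iwnlp_lemmas))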
```
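
Downstream code often needs exactly one lemma per token. The sketch below is one possible way to get it, not part of the package: `choose_lemma` is a hypothetical helper, and it assumes that `token._.iwnlp_lemmas` is either `None` or a list of strings. Taking the first candidate is an arbitrary policy chosen for illustration.

``` python
from typing import Optional, Sequence


def choose_lemma(iwnlp_lemmas: Optional[Sequence[str]], fallback: str) -> str:
    """Return the first IWNLP candidate, or `fallback` if there is none.

    Hypothetical helper. Assumes `iwnlp_lemmas` is None or a sequence of
    strings; picking the first candidate is an illustrative choice only.
    """
    if iwnlp_lemmas:
        return iwnlp_lemmas[0]
    return fallback


# Intended use with the pipeline from the example above:
#   lemmas = [choose_lemma(t._.iwnlp_lemmas, t.lemma_) for t in doc]
```

Falling back to spaCy's own `token.lemma_` keeps the output aligned one-to-one with the tokens, even for words that IWNLP does not cover.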

## Installation
1. Use pip to install spacy-iwnlp:
``` bash
pip install spacy-iwnlp
```
2. Download the latest processed IWNLP dump from https://dbs.cs.uni-duesseldorf.de/datasets/iwnlp/IWNLP.Lemmatizer_20181001.zip and unzip it.
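
Step 2 can also be scripted. The following is a minimal sketch using only the Python standard library. The function names and the `data/` target directory are assumptions made for illustration (the directory is chosen to match the path in the usage example), and the names of the files inside the archive are not verified here.

``` python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the installation step above
DUMP_URL = "https://dbs.cs.uni-duesseldorf.de/datasets/iwnlp/IWNLP.Lemmatizer_20181001.zip"


def download_dump(url: str, zip_path: Path) -> Path:
    """Download the archive at `url` to `zip_path` (hypothetical helper)."""
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract_dump(zip_path: Path, target_dir: Path) -> list:
    """Unzip `zip_path` into `target_dir` and return the archive member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example call (requires network access; the paths are assumptions):
#   archive = download_dump(DUMP_URL, Path("data/IWNLP.Lemmatizer_20181001.zip"))
#   print(extract_dump(archive, Path("data")))
```

Check the returned member names to confirm the path you pass as `lemmatizer_path`.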

## Local development
Use develop.py to extend the functionality.

To update the package on PyPI:

``` bash
python setup.py sdist bdist_wheel
python -m twine upload dist/PACKAGENAME-VERSION.tar.gz
```

## Citation
Please include the following BibTeX if you use IWNLP in your work:
``` bibtex
@InProceedings{liebeck-conrad:2015:ACL-IJCNLP,
  author    = {Liebeck, Matthias and Conrad, Stefan},
  title     = {{IWNLP: Inverse Wiktionary for Natural Language Processing}},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  pages     = {414--418},
  url       = {http://www.aclweb.org/anthology/P15-2068}
}
```