https://github.com/jfilter/german-preprocessing
🇩🇪 Preprocess German texts to do some serious natural-language processing.
- Host: GitHub
- URL: https://github.com/jfilter/german-preprocessing
- Owner: jfilter
- License: mit
- Created: 2019-07-30T10:09:22.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-09T05:18:39.000Z (over 3 years ago)
- Last Synced: 2025-07-05T11:54:57.478Z (11 months ago)
- Topics: german, nlp, package, python
- Language: Python
- Homepage:
- Size: 37.1 KB
- Stars: 11
- Watchers: 5
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# German Preprocessing [Travis CI](https://travis-ci.com/jfilter/german-preprocessing) [PyPI](https://pypi.org/project/german/)
Preprocess German texts to do some serious natural-language processing.
- [clean texts](https://github.com/jfilter/clean-text)
- remove stopwords (as defined by [spaCy](https://github.com/explosion/spaCy/blob/master/spacy/lang/de/stop_words.py))
- [lemmatize](https://github.com/jfilter/german-lemmatizer)
- lower-case the text, remove all punctuation, and replace digits with "0"
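
To make the last step and the stopword step more concrete, here is a minimal standard-library sketch. It is not the package's implementation: the function names `normalize` and `remove_stopwords`, the tiny `ILLUSTRATIVE_STOPWORDS` set, the restriction to ASCII punctuation, and the choice to replace each digit individually are all assumptions made for illustration. The package itself relies on clean-text and spaCy's German stopword list.

```python
import re
import string

# Hypothetical, tiny stopword set for illustration only;
# the package uses spaCy's German stopword list instead.
ILLUSTRATIVE_STOPWORDS = {"war", "einer", "von", "vielen"}

# Translation table that deletes ASCII punctuation characters.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Lower-case, strip ASCII punctuation, and replace each digit with "0"."""
    text = text.lower().translate(_PUNCT_TABLE)
    text = re.sub(r"\d", "0", text)  # assumption: one "0" per digit
    return " ".join(text.split())    # collapse runs of whitespace


def remove_stopwords(text: str, stopwords=ILLUSTRATIVE_STOPWORDS) -> str:
    """Drop whitespace-separated tokens that appear in the stopword set."""
    return " ".join(tok for tok in text.split() if tok not in stopwords)


print(normalize("Julia trinkt 12 Tassen Tee!"))
print(remove_stopwords(normalize("Johannes war einer von vielen guten Schülern.")))
```

This sketch does not lemmatize, so words keep their inflected forms ("guten", "schülern") where the package would return lemmas.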
## Installation
`pip install german`
## Usage
```python
from german import preprocess

# remove_stop=True additionally drops stopwords from each text
texts = ['Johannes war einer von vielen guten Schülern.', 'Julia trinkt gern Tee.']
preprocess(texts, remove_stop=True)
# ['johannes gut schüler', 'julia trinken tee']
```
## License
MIT.
## Sponsoring
This work was created as part of a [project](https://github.com/jfilter/ptf) that was funded by the German [Federal Ministry of Education and Research](https://www.bmbf.de/en/index.html).