Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lennon-c/de_wiktio
A Python package to parse and extract data from the German Wiktionary. It allows users to access wikitext content, either by fetching it directly online or by loading a dump file locally.
dewiktionary wikitext wiktionary wiktionary-dump wiktionary-parser
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/lennon-c/de_wiktio
- Owner: lennon-c
- License: mit
- Created: 2024-11-20T07:34:08.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-20T14:13:23.000Z (2 months ago)
- Last Synced: 2024-11-20T15:26:21.588Z (2 months ago)
- Topics: dewiktionary, wikitext, wiktionary, wiktionary-dump, wiktionary-parser
- Language: Python
- Homepage: https://lennon-c.github.io/de_wiktio/
- Size: 679 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: docs/README.md
- License: LICENSE
README
`de_wiktio` is a Python package designed to parse and extract data from the *German Wiktionary*. It enables users to access *wikitext* content either by fetching it directly online or by preprocessing and loading dump files locally for faster access. It can extract linguistic data such as parts of speech, inflections, examples, and definitions.
This package can be thought of as a companion to the [Hands-on Guide](https://lennon-c.github.io/python-wikitext-parser-guide); all the steps and code covered in the guide are implemented here as a package.
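To give a rough idea of what loading a dump locally involves, here is a minimal sketch that streams pages out of a MediaWiki-style XML export and pulls the part of speech from the wikitext. It does not use the `de_wiktio` API: the function names (`iter_entries`, `parts_of_speech`) and the inline sample dump are hypothetical, and it relies on the standard library's `xml.etree.ElementTree` and `re` rather than the package's actual dependencies (`lxml`, `mwparserfromhell`).

```python
import io
import re
import xml.etree.ElementTree as ET

# Tiny stand-in for a Wiktionary XML dump (illustrative only). Real dumps are
# far larger and also declare an XML namespace on the root element.
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page>
    <title>Haus</title>
    <ns>0</ns>
    <revision><text>== Haus ({{Sprache|Deutsch}}) ==
=== {{Wortart|Substantiv|Deutsch}}, {{n}} ===</text></revision>
  </page>
  <page>
    <title>Hilfe:Test</title>
    <ns>12</ns>
    <revision><text>Help page, not a dictionary entry.</text></revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_entries(source):
    """Yield (title, wikitext) for main-namespace pages, one page at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        # Index the page element and its descendants by local tag name.
        fields = {local_name(child.tag): child for child in elem.iter()}
        if fields["ns"].text == "0":  # namespace 0 holds dictionary entries
            yield fields["title"].text, fields["text"].text or ""
        elem.clear()  # drop the finished page's children to limit memory use


def parts_of_speech(wikitext):
    """Return the first argument of every {{Wortart|...}} template."""
    return re.findall(r"\{\{Wortart\|([^|}]+)", wikitext)


entries = dict(iter_entries(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
for title, wikitext in entries.items():
    print(title, parts_of_speech(wikitext))
```

A regular expression is enough for this one template, but nested templates and inflection tables generally call for a real wikitext parser, which is presumably why the package depends on `mwparserfromhell`.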
## Installation
The package was created using Python 3.11, so make sure that you have Python 3.11 or later. You can install the package from my GitHub repo. The following command installs the package and its dependencies (i.e. `requests`, `lxml`, and `mwparserfromhell`):

```bash
pip install git+https://github.com/lennon-c/de_wiktio.git
```

## Documentation
On the documentation site, you can find:

- [Usage examples](https://lennon-c.github.io/de_wiktio/usage/) and
- [The API documentation](https://lennon-c.github.io/de_wiktio/API/).

## The Story Behind This Package
I created this package as a personal project to extract inflection tables, which I use in my flashcard system for learning German.