https://github.com/suyashb95/wiktionaryparser
A Python Wiktionary Parser
- Host: GitHub
- URL: https://github.com/suyashb95/wiktionaryparser
- Owner: suyashb95
- License: mit
- Archived: true
- Created: 2015-10-23T14:41:30.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-01-12T21:01:20.000Z (about 1 year ago)
- Last Synced: 2025-01-10T12:55:43.781Z (17 days ago)
- Topics: mediawiki, parser, python, wiktionary-parser
- Language: Python
- Homepage:
- Size: 511 KB
- Stars: 360
- Watchers: 12
- Forks: 93
- Open Issues: 35
Metadata Files:
- Readme: readme.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
README
### Wiktionary Parser
A Python project that downloads words from English Wiktionary ([en.wiktionary.org](https://en.wiktionary.org)) and parses each article's content into an easy-to-use JSON format. It currently parses etymologies, definitions, pronunciations, examples, audio links, and related words.
[![Downloads](http://pepy.tech/badge/wiktionaryparser)](http://pepy.tech/project/wiktionaryparser)
#### JSON structure
```json
[{
"pronunciations": {
"text": ["pronunciation text"],
"audio": ["pronunciation audio"]
},
"definitions": [{
"relatedWords": [{
"relationshipType": "word relationship type",
"words": ["list of related words"]
}],
"text": ["list of definitions"],
"partOfSpeech": "part of speech",
"examples": ["list of examples"]
}],
"etymology": "etymology text"
}]
```
#### Installation
##### Using pip
* run `pip install wiktionaryparser`
##### From Source
* Clone the repo or download the zip
* `cd` to the folder
* run `pip install -r "requirements.txt"`
#### Usage
- Import the WiktionaryParser class.
- Initialize an object and use the `fetch("word", "language")` method.
- The default language is English; it can be changed using the `set_default_language` method.
- Include/exclude parts of speech to be parsed using `include_part_of_speech(part_of_speech)` and `exclude_part_of_speech(part_of_speech)`
- Include/exclude relations to be parsed using `include_relation(relation)` and `exclude_relation(relation)`
#### Examples
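Once a word has been fetched, the result can be walked like any nested list of dictionaries. The sketch below assumes the result follows the JSON structure documented above; `summarize_entries`, `related_words`, and the `sample` data are hypothetical names and hand-written placeholder values for illustration, not part of the library and not real Wiktionary output.

```python
from typing import Any, Dict, List, Tuple


def summarize_entries(entries: List[Dict[str, Any]]) -> List[Tuple[str, str]]:
    """Return (part of speech, first definition text) pairs for every definition."""
    summary = []
    for entry in entries:
        for definition in entry.get("definitions", []):
            texts = definition.get("text", [])
            first_text = texts[0] if texts else ""
            summary.append((definition.get("partOfSpeech", ""), first_text))
    return summary


def related_words(entries: List[Dict[str, Any]], relationship_type: str) -> List[str]:
    """Collect all related words of one relationship type across all definitions."""
    words = []
    for entry in entries:
        for definition in entry.get("definitions", []):
            for relation in definition.get("relatedWords", []):
                if relation.get("relationshipType") == relationship_type:
                    words.extend(relation.get("words", []))
    return words


# Hand-written sample that mirrors the documented structure (assumption:
# a fetched result has this shape; the values here are placeholders).
sample = [{
    "pronunciations": {"text": ["placeholder pronunciation"], "audio": []},
    "definitions": [
        {
            "relatedWords": [
                {"relationshipType": "synonyms", "words": ["trial", "exam"]}
            ],
            "text": ["A challenge or trial."],
            "partOfSpeech": "noun",
            "examples": [],
        },
        {
            "relatedWords": [],
            "text": ["To challenge."],
            "partOfSpeech": "verb",
            "examples": [],
        },
    ],
    "etymology": "placeholder etymology text",
}]

print(summarize_entries(sample))
print(related_words(sample, "synonyms"))
```

In practice, the list returned by `parser.fetch(...)` would be passed in place of `sample`, provided it matches the documented structure. Using `.get` with defaults keeps the helpers tolerant of entries where a key is missing.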
```python
>>> from wiktionaryparser import WiktionaryParser
>>> parser = WiktionaryParser()
>>> word = parser.fetch('test')
>>> another_word = parser.fetch('test', 'french')
>>> parser.set_default_language('french')
>>> parser.exclude_part_of_speech('noun')
>>> parser.include_relation('alternative forms')
```
#### Requirements
- requests==2.20.0
- beautifulsoup4==4.4.0
#### Contributions
If you want to add features or improvements, or report issues, feel free to send a pull request!
#### License
Wiktionary Parser is licensed under [MIT](LICENSE.txt).