Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/AsoSoft/Vejinbooks-Poem-Dataset
A dataset of 1154 Central Kurdish poems with meter and form tags extracted from vejinbooks.com
- Host: GitHub
- URL: https://github.com/AsoSoft/Vejinbooks-Poem-Dataset
- Owner: AsoSoft
- Created: 2018-12-04T03:13:22.000Z
- Default Branch: master
- Last Pushed: 2020-12-29T20:16:48.000Z
- Last Synced: 2024-01-28T23:08:59.018Z
- Topics: kurdish, vejinbooks
- Size: 869 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-kurdish - Dataset of Kurdish poems with meter and form tags
README
[![DOI](https://zenodo.org/badge/160289263.svg)](https://zenodo.org/badge/latestdoi/160289263)
# VejinBooks Poem Dataset
Dataset of 1154 Central Kurdish poems extracted from [vejinbooks](http://books.vejin.net) by the AsoSoft group.
This dataset was used to evaluate an automatic Kurdish poem meter identification method based on Optimality Theory.

Prepared by: Aso Mahmudi (`aso.mehmudi[at]gmail.com`)
## Features of all poems:
* Language is Central Kurdish (ckb)
* Meter is specified manually by experts
* Form is specified manually
* Non-Kurdish phrases are removed
* Length is more than 5 lines (more than 2 couplets)

## File name convention
Poet + Poem-Number (for example: `Diɫdar002.txt`)

## Text file format
* `title: ` Title of the poem in Kurdish (usually the same as the first hemistich of the poem)
* `form: ` Form of the poem in Kurdish (e.g. غەزەل, مەسنەوی, نوێ)
* `meter: ` Meter pattern of the poem in English or Persian (e.g. 10Syllabic, مفاعیلن مفاعیلن مفاعیلن مفاعیلن)
* Poem text starts on line 4.
* Number sign (`#`) means the start of a new couplet.
* Non-Kurdish hemistiches are replaced with two plus signs (`++`)
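
The file name convention and text file format described above can be read with a short script. The following is a minimal sketch, not part of the dataset: the names `Poem`, `split_filename`, and `parse_poem` are hypothetical, the files are assumed to be UTF-8, and because the README does not say whether `#` sits on its own line or prefixes the first hemistich of a couplet, the parser accepts both. The `++` placeholders are kept as-is so that callers can decide how to treat them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Poem:
    title: str
    form: str
    meter: str
    # Each couplet is a list of hemistich lines; "++" marks a removed non-Kurdish hemistich.
    couplets: list[list[str]] = field(default_factory=list)


def split_filename(name: str) -> tuple[str, int]:
    """Split e.g. 'Diɫdar002.txt' into ('Diɫdar', 2). Assumes a trailing run of digits."""
    match = re.fullmatch(r"(?P<poet>.+?)(?P<number>\d+)\.txt", name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return match.group("poet"), int(match.group("number"))


def parse_poem(text: str) -> Poem:
    """Parse one poem file: three header lines, then the poem text from line 4."""
    lines = text.lstrip("\ufeff").splitlines()  # drop a byte-order mark if present
    header = {}
    for key, line in zip(("title", "form", "meter"), lines[:3]):
        name, _, value = line.partition(":")
        if name.strip() != key:
            raise ValueError(f"expected '{key}:' header, got {line!r}")
        header[key] = value.strip()

    couplets: list[list[str]] = []
    for raw in lines[3:]:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            couplets.append([])          # '#' opens a new couplet
            line = line[1:].strip()
            if not line:
                continue
        if not couplets:                 # text before any '#': treat as first couplet
            couplets.append([])
        couplets[-1].append(line)
    return Poem(header["title"], header["form"], header["meter"], couplets)


if __name__ == "__main__":
    # Invented sample content, used only to exercise the parser.
    sample = "title: نموونە\nform: غەزەل\nmeter: 10Syllabic\n#\nline a\nline b\n#\n++\nline d\n"
    poem = parse_poem(sample)
    print(poem.form, poem.meter, len(poem.couplets))
    print(split_filename("Diɫdar002.txt"))
```

To load a real file, pass `pathlib.Path(path).read_text(encoding="utf-8")` to `parse_poem`; the encoding is an assumption and may need adjusting.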