https://github.com/global-asp/pb-source

Pratham Books stories in Markdown format

corpus creative-commons india multilingual storybooks


# Source stories from the Pratham Books collection in Markdown format

This repository makes available the source texts of open-licensed stories from [Pratham Books](http://prathambooks.org/) in Markdown format.

Each folder in the repository represents a language, identified by its [ISO 639-1](http://en.wikipedia.org/wiki/ISO_639-1) or [ISO 639-3](http://en.wikipedia.org/wiki/ISO_639-3) code. Source translations into each language are stored in the appropriate folders.

All of these source texts have been extracted from the EPUB files available on the [Storyweaver](https://storyweaver.org.in) website. The Markdown files in this repo provide data for many other projects, for example the translations in the [Global Pratham Books Project](https://github.com/global-asp/global-pb) and the [PB Image Bank Explorer](https://github.com/dohliam/pb-imagebank-explorer). They also make it easy to create bilingual storybooks in any language combination.

Corresponding images for the stories in this repository can be found in the [Pratham Books Image Bank](https://github.com/global-asp/pb-imagebank).

## Format

The extracted source text of all stories has been provided here in Markdown format. See [here](https://github.com/global-asp/global-asp#source-format) for specific details about the format used.

A sequence of two hashes `##` on a separate line indicates a page break.
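
As an illustration of the page-break convention, the following is a minimal Python sketch that splits a story into pages. The function name `split_pages` and the sample text are hypothetical, and the sketch assumes only what is stated above: a line consisting solely of `##` separates pages. It does not attempt to parse the title or the metadata section; see the linked format description for those details.

```python
def split_pages(markdown_text: str) -> list[str]:
    """Split a story into pages on lines that contain only `##`.

    Assumption: a page break is a line whose stripped content is exactly
    `##`. Headings such as `## Title` are left untouched.
    """
    pages = []
    current = []
    for line in markdown_text.splitlines():
        if line.strip() == "##":
            # Page break: close the current page and start a new one.
            pages.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    pages.append("\n".join(current).strip())
    # Drop empty chunks, e.g. from consecutive or trailing breaks.
    return [page for page in pages if page]


# Hypothetical sample text, not taken from the repository.
sample = "# Title\n\n##\nPage one text.\n\n##\nPage two text.\n"
pages = split_pages(sample)
```

Each element of `pages` is the text between two page breaks, so the same function could be applied to a story in two languages to pair up pages for a bilingual edition, provided both files have the same number of breaks.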

Editing of the story content has been kept to a minimum, and for the most part the stories are presented as they are. Only obvious spelling errors and artifacts of the conversion process are corrected here; all other corrections should be directed to Pratham Books through the [Storyweaver](https://storyweaver.org.in) website.

## Languages

Pratham Books currently provides stories in 35 different languages. This repository attempts to provide the source text for all of these stories in machine- and human-readable Markdown format.

Below is a key to the languages covered by this repository and their ISO 639-1/3 codes.

ISO code | Language Name
-------- | -------------
[as](https://github.com/global-asp/pb-source/tree/master/as) | Assamese
[bn](https://github.com/global-asp/pb-source/tree/master/bn) | Bengali
[en](https://github.com/global-asp/pb-source/tree/master/en) | English
[gu](https://github.com/global-asp/pb-source/tree/master/gu) | Gujarati
[hi](https://github.com/global-asp/pb-source/tree/master/hi) | Hindi
[kn](https://github.com/global-asp/pb-source/tree/master/kn) | Kannada
[kok](https://github.com/global-asp/pb-source/tree/master/kok) | Konkani
[kru](https://github.com/global-asp/pb-source/tree/master/kru) | Kurukh
[ml](https://github.com/global-asp/pb-source/tree/master/ml) | Malayalam
[mqu](https://github.com/global-asp/pb-source/tree/master/mqu) | Mundari
[mr](https://github.com/global-asp/pb-source/tree/master/mr) | Marathi
[or](https://github.com/global-asp/pb-source/tree/master/or) | Oriya
[pa](https://github.com/global-asp/pb-source/tree/master/pa) | Punjabi
[sa](https://github.com/global-asp/pb-source/tree/master/sa) | Sanskrit
[sck](https://github.com/global-asp/pb-source/tree/master/sck) | Sadri
[ta](https://github.com/global-asp/pb-source/tree/master/ta) | Tamil
[te](https://github.com/global-asp/pb-source/tree/master/te) | Telugu
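
Because each language lives in a folder named after its ISO code, a local clone can be indexed by language with a few lines of Python. The sketch below is only an illustration: the function name `stories_by_language` is hypothetical, and it assumes that story files end in `.md` and that each language folder's `README.md` is an index rather than a story.

```python
from pathlib import Path


def stories_by_language(repo_root: str) -> dict[str, list[str]]:
    """Map each language folder (ISO code) to its story file names.

    Assumptions: every non-hidden top-level directory is a language
    folder, and stories are the `.md` files other than `README.md`.
    """
    root = Path(repo_root)
    index = {}
    for folder in sorted(root.iterdir()):
        if not folder.is_dir() or folder.name.startswith("."):
            continue  # skip plain files and hidden folders such as .git
        index[folder.name] = sorted(
            f.name for f in folder.glob("*.md") if f.name != "README.md"
        )
    return index
```

For example, `stories_by_language("pb-source")` on a local clone would return a dictionary keyed by codes such as `en` or `hi`.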

## License

All stories in this repository are [Creative Commons](https://creativecommons.org/) licensed (CC-BY 4.0), with the exception of several stories that are in the Public Domain. The specific license for each story is indicated both in the metadata section at the bottom of each file and in the corresponding `README.md` file for that language. Direct links to the original stories on the Pratham Books website can also be found in the `README.md` files.