https://github.com/sagorbrur/bntranslit
Bangla Transliteration Package
bangla bangla-transliteration bengali bengali-transliteration deep-learning nlp pytorch transliteration
- Host: GitHub
- URL: https://github.com/sagorbrur/bntranslit
- Owner: sagorbrur
- License: mit
- Created: 2021-04-28T16:52:35.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2021-05-05T14:03:27.000Z (over 3 years ago)
- Last Synced: 2024-12-17T03:34:50.406Z (17 days ago)
- Topics: bangla, bangla-transliteration, bengali, bengali-transliteration, deep-learning, nlp, pytorch, transliteration
- Language: Python
- Homepage:
- Size: 35.3 MB
- Stars: 8
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: license
README
# BNTRANSLIT
__BNTRANSLIT__ is a deep-learning-based transliteration package for Bangla words. It converts romanized input such as `aami` into Bangla script.

## Installation
`pip install bntranslit`

## Dependency
- PyTorch 1.7.0 or later

NB: No `GPU` is needed; the model runs entirely on the `CPU`.

## Pre-trained Model
- [Download bntranslit_model](https://github.com/sagorbrur/bntranslit/raw/master/model/bntranslit_model.pth)

## Usage
```py
from bntranslit import BNTransliteration

model_path = "bntranslit_model.pth"
bntrans = BNTransliteration(model_path)

word = "aami"
output = bntrans.predict(word, topk=10)
# output: ['আমি', 'আমী', 'অ্যামি', 'আমিই', 'এমি', 'আমির', 'আমিদ', 'আমই', 'আমে', 'আমিতে']
```
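
The `predict` call above handles one word at a time. A minimal sketch of applying it to a whole sentence follows. The helper `transliterate_sentence` is hypothetical and not part of the package, and it assumes that candidates are returned best-first, as the sample output suggests. It accepts any callable with the same `predict(word, topk=...)` shape, so it can be tried without loading the model.

```python
from typing import Callable, List


def transliterate_sentence(
    sentence: str,
    predict: Callable[..., List[str]],
    topk: int = 1,
) -> str:
    """Transliterate each whitespace-separated word and keep the first candidate.

    `predict` is assumed to behave like `BNTransliteration.predict`:
    it takes a word and `topk`, and returns a list of candidate strings.
    """
    result = []
    for word in sentence.split():
        candidates = predict(word, topk=topk)
        # Keep the original word if the predictor returns no candidates.
        result.append(candidates[0] if candidates else word)
    return " ".join(result)
```

With the model loaded as in the usage example, this could be called as `transliterate_sentence("aami", bntrans.predict)`. Punctuation and digits are passed to the predictor unchanged here, so real input would likely need preprocessing.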
## Datasets and Training Details
- We used the [Google Dakshina dataset](https://github.com/google-research-datasets/dakshina).
- Thanks to [AI4Bharat](https://github.com/AI4Bharat/IndianNLP-Transliteration) for providing a training notebook with detailed explanations.
- We trained on the Bangla lexicon training set from Google Dakshina for 10 epochs with a batch size of 128, a learning rate of 1e-3, embedding dimension 300, and hidden dimension 512, using an LSTM-based model with attention.
- We evaluated the trained model on the Bangla lexicon test set from Google Dakshina using the [AI4Bharat evaluation script](https://raw.githubusercontent.com/AI4Bharat/IndianNLP-Transliteration/jgeob-dev/tools/accuracy_reporter/accuracy_news.py). The evaluation results are in `docs/evaluation_summary.txt`.
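
To illustrate how ranked candidates like those returned with `topk` can be scored against a test lexicon, here is a minimal sketch of a top-k accuracy computation. It is a simplified stand-in, not a reimplementation of the AI4Bharat script. The function name `topk_accuracy`, the dictionary layout, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def topk_accuracy(
    predictions: Dict[str, Sequence[str]],
    references: Dict[str, Sequence[str]],
    k: int = 1,
) -> float:
    """Fraction of source words whose first k candidates include an accepted reference.

    predictions: romanized word -> ranked candidates (best first)
    references:  romanized word -> accepted Bangla spellings
    """
    if not references:
        return 0.0
    hits = 0
    for source, accepted in references.items():
        top = list(predictions.get(source, []))[:k]
        if any(candidate in accepted for candidate in top):
            hits += 1
    return hits / len(references)


# Toy data for illustration only; these are not results from the model.
toy_predictions = {"aami": ["আমি", "আমী"], "tumi": ["তুমী", "তুমি"]}
toy_references = {"aami": ["আমি"], "tumi": ["তুমি"]}

print(topk_accuracy(toy_predictions, toy_references, k=1))  # 0.5
print(topk_accuracy(toy_predictions, toy_references, k=2))  # 1.0
```

Allowing several accepted spellings per source word reflects that a romanized word can have more than one valid Bangla rendering.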