https://github.com/pyurbans/urbans
A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.
- Host: GitHub
- URL: https://github.com/pyurbans/urbans
- Owner: pyurbans
- License: apache-2.0
- Created: 2020-08-23T06:05:06.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-04-07T09:29:37.000Z (almost 3 years ago)
- Last Synced: 2024-09-25T06:38:46.021Z (5 months ago)
- Topics: artificial-intelligence, data-science, machine-translation, nlp, python
- Language: Python
- Homepage: https://github.com/pyurbans/urbans
- Size: 158 KB
- Stars: 21
- Watchers: 3
- Forks: 7
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# URBANS: Universal Rule-Based Machine Translation toolkit
**A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.**

*Why not translate it yourself when Google Translate cannot satisfy you❓*
[CircleCI](https://circleci.com/gh/pyurbans/urbans/tree/master)
[Codacy Badge](https://www.codacy.com/gh/pyurbans/urbans?utm_source=github.com&utm_medium=referral&utm_content=pyurbans/urbans&utm_campaign=Badge_Grade)
[Codacy Badge](https://www.codacy.com/gh/pyurbans/urbans?utm_source=github.com&utm_medium=referral&utm_content=pyurbans/urbans&utm_campaign=Badge_Coverage)
[PyPI version](https://badge.fury.io/py/urbans)
[GitHub release](https://GitHub.com/pyurbans/urbans/releases/)
[Maintenance](https://GitHub.com/pyurbans/urbans/graphs/commit-activity)
[License](https://github.com/pyurbans/urbans/blob/master/LICENSE)

## ⚙️ Installation
```bash
pip install urbans
```

## ✨ What is good about urbans?
- Rule-based, deterministic translation: the same grammar, rules, and dictionary always produce the same output, unlike Google Translate, which returns a single, non-deterministic result
- Uses the NLTK parsing interface and is built on top of the already-efficient NLTK backend
- Can be used for data augmentation

## 📖 Usage
```python
from urbans import Translator

# Source sentences to be translated
src_sentences = ["I love good dogs", "I hate bad dogs"]

# Source grammar in NLTK parsing style
src_grammar = """
S -> NP VP
NP -> PRP
VP -> VB NP
NP -> JJ NN
PRP -> 'I'
VB -> 'love' | 'hate'
JJ -> 'good' | 'bad'
NN -> 'dogs'
"""

# Edits that turn source grammar rules into target grammar rules
src_to_target_grammar = {
"NP -> JJ NN": "NP -> NN JJ" # in Vietnamese NN goes before JJ
}

# Word-by-word dictionary from source language to target language
en_to_vi_dict = {
"I":"tôi",
"love":"yêu",
"hate":"ghét",
"dogs":"những chú_chó",
"good":"ngoan",
"bad":"hư"
}

translator = Translator(src_grammar=src_grammar,
                        src_to_tgt_grammar=src_to_target_grammar,
                        src_to_tgt_dictionary=en_to_vi_dict)

trans_sentences = translator.translate(src_sentences)
# This should return ['tôi yêu những chú_chó ngoan', 'tôi ghét những chú_chó hư']
```

## ⚖️ License
This repository is licensed under the Apache 2.0 license. See [`LICENSE`](https://github.com/pyurbans/urbans/blob/master/LICENSE) for the full text.

## ✍️ BibTeX
If you wish to cite the framework, feel free to use this (but only if you loved it 😊):
```bibtex
@misc{phat2020urbans,
author = {Truong-Phat Nguyen},
title = {URBANS: Universal Rule-Based Machine Translation NLP toolkit},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/pyurbans/urbans}},
}
```

## Contributors:
- Patrick Phat Nguyen