https://github.com/pyurbans/urbans
A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.
- Host: GitHub
- URL: https://github.com/pyurbans/urbans
- Owner: pyurbans
- License: apache-2.0
- Created: 2020-08-23T06:05:06.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-04-07T09:29:37.000Z (almost 3 years ago)
- Last Synced: 2024-09-25T06:38:46.021Z (5 months ago)
- Topics: artificial-intelligence, data-science, machine-translation, nlp, python
- Language: Python
- Homepage: https://github.com/pyurbans/urbans
- Size: 158 KB
- Stars: 21
- Watchers: 3
- Forks: 7
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# URBANS: Universal Rule-Based Machine Translation toolkit
**A tool for translating text from source grammar to target grammar (context-free) with corresponding dictionary.**

*Why not translate it yourself when Google Translate cannot satisfy you❓*
## ⚙️ Installation
```bash
pip install urbans
```

## ✨ What is good about urbans?
- Rule-based, deterministic translation, unlike Google Translate, which gives only one non-deterministic result
- Uses the NLTK parsing interface and is built on top of the already-efficient NLTK backend
- Can be used for data augmentation

## 📖 Usage
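To make the rule-based idea concrete before the library example, here is a minimal, library-free sketch of the two steps involved: reordering the children of a parse-tree node according to a grammar edit, then replacing each word through a dictionary. It is an illustration under stated assumptions, not how urbans is implemented. The tree is written by hand instead of being produced by a parser, and the helper names `child_label`, `reorder`, `leaves`, and `translate_tree` are hypothetical.

```python
# Illustrative sketch only; these helpers are NOT part of the urbans API.
# A parse tree is a nested tuple (label, child, ...); a leaf is a plain string.

def child_label(child):
    """Symbol a child contributes to a rule: quoted word or nonterminal label."""
    return f"'{child}'" if isinstance(child, str) else child[0]

def reorder(node, rules):
    """Recursively apply 'source rule -> target rule' edits to a tree."""
    if isinstance(node, str):
        return node
    label, *children = node
    children = [reorder(child, rules) for child in children]
    src_rule = f"{label} -> {' '.join(child_label(c) for c in children)}"
    if src_rule in rules:
        target_rhs = rules[src_rule].split("->")[1].split()
        # Assumption: each symbol appears at most once on a rule's right-hand side.
        by_label = {child_label(c): c for c in children}
        children = [by_label[symbol] for symbol in target_rhs]
    return (label, *children)

def leaves(node):
    """Collect the words of a tree from left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1:] for leaf in leaves(child)]

def translate_tree(tree, rules, dictionary):
    """Reorder the tree, then translate word by word (unknown words pass through)."""
    return " ".join(dictionary.get(word, word) for word in leaves(reorder(tree, rules)))

# Hand-written parse of "I love good dogs" under the grammar used in this README.
tree = ("S",
        ("NP", ("PRP", "I")),
        ("VP", ("VB", "love"),
               ("NP", ("JJ", "good"), ("NN", "dogs"))))
rules = {"NP -> JJ NN": "NP -> NN JJ"}
en_to_vi = {"I": "tôi", "love": "yêu", "dogs": "những chú_chó", "good": "ngoan"}

result = translate_tree(tree, rules, en_to_vi)
print(result)
```

Because both the rule table and the dictionary are plain lookups, the same input always yields the same output, which is the deterministic behaviour the list above refers to.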
```python
from urbans import Translator

# Source sentences to be translated
src_sentences = ["I love good dogs", "I hate bad dogs"]

# Source grammar in NLTK parsing style
src_grammar = """
S -> NP VP
NP -> PRP
VP -> VB NP
NP -> JJ NN
PRP -> 'I'
VB -> 'love' | 'hate'
JJ -> 'good' | 'bad'
NN -> 'dogs'
"""

# Edits that map source grammar rules to target grammar rules
src_to_target_grammar = {
    "NP -> JJ NN": "NP -> NN JJ"  # in Vietnamese NN goes before JJ
}

# Word-by-word dictionary from source language to target language
en_to_vi_dict = {
    "I": "tôi",
    "love": "yêu",
    "hate": "ghét",
    "dogs": "những chú_chó",
    "good": "ngoan",
    "bad": "hư"
}

translator = Translator(src_grammar=src_grammar,
                        src_to_tgt_grammar=src_to_target_grammar,
                        src_to_tgt_dictionary=en_to_vi_dict)

trans_sentences = translator.translate(src_sentences)
# This should return ['tôi yêu những chú_chó ngoan', 'tôi ghét những chú_chó hư']
```

## ⚖️ License
This repository is licensed under the Apache 2.0 license. See [`LICENSE`](https://github.com/pyurbans/urbans/blob/master/LICENSE) for the full text.

## ✍️ BibTeX
If you wish to cite the framework, feel free to use this (but only if you loved it 😊):
```bibtex
@misc{phat2020urbans,
author = {Truong-Phat Nguyen},
title = {URBANS: Universal Rule-Based Machine Translation NLP toolkit},
year = {2021},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/pyurbans/urbans}},
}
```

## Contributors:
- Patrick Phat Nguyen