https://github.com/tushar2704/hinglish
This project focuses on building a Neural Machine Translation (NMT) system to translate English sentences to Hindi.
- Host: GitHub
- URL: https://github.com/tushar2704/hinglish
- Owner: tushar2704
- License: apache-2.0
- Created: 2023-08-10T10:34:50.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-16T11:01:18.000Z (almost 3 years ago)
- Last Synced: 2024-12-27T08:16:45.209Z (over 1 year ago)
- Topics: data-science, hindi-english-translation, nlp, nmt, python
- Language: Jupyter Notebook
- Homepage: https://tushar-aggarwal.com
- Size: 33.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# English to Hindi Neural Machine Translation
## Overview
This project builds a Neural Machine Translation (NMT) system that translates English sentences into Hindi. NMT models are trained end to end on parallel text and typically produce more fluent translations than earlier phrase-based statistical systems.
## Features
- **Encoder-Decoder Architecture**: The NMT system employs an encoder-decoder architecture, where the encoder encodes the input English sentence into a fixed-size context vector, and the decoder generates the corresponding Hindi translation from the context vector.
- **Attention Mechanism**: To handle longer sentences and capture relevant information effectively, an attention mechanism is integrated. This allows the model to focus on different parts of the input sentence while generating the output.
- **Data Preprocessing**: The project includes data preprocessing steps to clean and normalize input sentences, ensuring better alignment and accuracy in translation.
- **Training and Evaluation**: The model is trained on a parallel corpus of English-Hindi sentence pairs. During training, the model learns to minimize the translation loss. The evaluation process demonstrates the model's translation quality with selected input sentences.
- **Visualization of Attention**: The project offers a visualization of attention weights, showing how the model attends to different parts of the input during translation.
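The encoder-decoder and attention features listed above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the code from the project's notebook: the class names (`Encoder`, `AdditiveAttention`, `AttnDecoderStep`), the GRU layers, the additive (Bahdanau-style) scoring, and all sizes are assumptions chosen for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Embeds source tokens and runs them through a GRU (hypothetical layout)."""

    def __init__(self, vocab_size, hidden_size, dropout=0.1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):  # src: (batch, src_len)
        embedded = self.dropout(self.embedding(src))
        outputs, hidden = self.gru(embedded)
        # outputs: (batch, src_len, hidden); hidden: (1, batch, hidden)
        return outputs, hidden


class AdditiveAttention(nn.Module):
    """Scores each encoder output against the current decoder state."""

    def __init__(self, hidden_size):
        super().__init__()
        self.w_query = nn.Linear(hidden_size, hidden_size)
        self.w_keys = nn.Linear(hidden_size, hidden_size)
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, query, keys):
        # query: (batch, 1, hidden); keys: (batch, src_len, hidden)
        scores = self.score(torch.tanh(self.w_query(query) + self.w_keys(keys)))
        weights = F.softmax(scores.squeeze(-1), dim=-1)   # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), keys)   # (batch, 1, hidden)
        return context, weights


class AttnDecoderStep(nn.Module):
    """Generates one target token from the previous token and the context."""

    def __init__(self, vocab_size, hidden_size, dropout=0.1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.attention = AdditiveAttention(hidden_size)
        self.gru = nn.GRU(2 * hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, prev_token, hidden, encoder_outputs):
        # prev_token: (batch, 1); hidden: (1, batch, hidden)
        embedded = self.dropout(self.embedding(prev_token))
        query = hidden.permute(1, 0, 2)                    # (batch, 1, hidden)
        context, weights = self.attention(query, encoder_outputs)
        output, hidden = self.gru(torch.cat((embedded, context), dim=2), hidden)
        log_probs = F.log_softmax(self.out(output.squeeze(1)), dim=-1)
        return log_probs, hidden, weights


# One decoding step on random token ids (vocabulary sizes are made up).
encoder = Encoder(vocab_size=50, hidden_size=16)
decoder = AttnDecoderStep(vocab_size=60, hidden_size=16)
src = torch.randint(0, 50, (2, 7))            # 2 sentences, 7 tokens each
encoder_outputs, hidden = encoder(src)
start = torch.zeros(2, 1, dtype=torch.long)   # assumed start-of-sentence id 0
log_probs, hidden, weights = decoder(start, hidden, encoder_outputs)
```

The `weights` tensor returned at each step holds one row per sentence that sums to 1 over the source positions. Stacking these rows across decoding steps gives the kind of matrix an attention visualization would plot.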
## Usage
1. **Data Preparation**: Prepare your parallel corpus of English-Hindi sentence pairs. Ensure that your data is properly formatted and cleaned.
2. **Model Configuration**: Set up the encoder and attention-based decoder architecture in the code. Define the hyperparameters, such as hidden size, learning rate, and dropout rate.
3. **Training**: Train the model using the provided training functions. Adjust the number of training iterations, print intervals, and other parameters as needed.
4. **Evaluation and Visualization**: Evaluate the model's translation quality using the `evaluateAndShowAttention` function. Provide your English input sentences and observe both the translated output and attention visualization.
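For the data preparation step, a cleaning pass over the parallel corpus might look like the sketch below. It assumes one tab-separated English-Hindi pair per line, which the README does not specify, and the function names (`normalize_english`, `normalize_hindi`, `read_pairs`) and the `max_words` limit are hypothetical. Note that the Hindi side is not accent-stripped: Devanagari vowel signs are combining marks, so removing them would corrupt the text.

```python
import re
import unicodedata


def normalize_english(text):
    """Lowercase, strip accents, and separate sentence punctuation."""
    text = unicodedata.normalize("NFD", text.lower().strip())
    # Drop combining marks left over after decomposition (e.g. the accent in "é").
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    text = re.sub(r"([.!?])", r" \1", text)
    text = re.sub(r"[^a-z.!?]+", " ", text)
    return text.strip()


def normalize_hindi(text):
    """Keep Devanagari characters (including vowel signs) and basic punctuation."""
    text = unicodedata.normalize("NFC", text.strip())
    text = re.sub(r"([।.!?])", r" \1", text)
    text = re.sub(r"[^\u0900-\u097F.!?]+", " ", text)
    return text.strip()


def read_pairs(lines, max_words=20):
    """Parse tab-separated lines into cleaned (english, hindi) pairs."""
    pairs = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # skip malformed lines
        en, hi = normalize_english(parts[0]), normalize_hindi(parts[1])
        if not en or not hi:
            continue  # skip pairs that are empty after cleaning
        if len(en.split()) <= max_words and len(hi.split()) <= max_words:
            pairs.append((en, hi))
    return pairs


sample = ["Hello, World!\tनमस्ते दुनिया।\n", "a line without a tab\n"]
print(read_pairs(sample))
# [('hello world !', 'नमस्ते दुनिया ।')]
```

In practice `lines` could be an open file handle, for example `read_pairs(open(path, encoding="utf-8"))`, where `path` points at your own corpus file.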
## Dependencies
- Python 3.x
- PyTorch
- Matplotlib
## Contributing
Contributions to this project are welcome! Whether it's improving the model's performance, enhancing the visualization, or extending the features, your contributions can make a significant impact.
## License
This project is licensed under the [Apache License 2.0](LICENSE).
## References
- [Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation](https://arxiv.org/abs/1406.1078)
- [Sequence to Sequence Learning with Neural Networks](https://arxiv.org/abs/1409.3215)
- [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473)
- [A Neural Conversational Model](https://arxiv.org/abs/1506.05869)
## Author
- ©2023 Tushar Aggarwal. All rights reserved
- [LinkedIn](https://www.linkedin.com/in/tusharaggarwalinseec/)
- [Medium](https://medium.com/@tushar_aggarwal)
- [Tushar-Aggarwal.com](https://www.tushar-aggarwal.com/)
- [New Kaggle](https://www.kaggle.com/tagg27)
## Contact me!
If you have any questions or suggestions, or just want to say hello, you can reach me at [Tushar Aggarwal](mailto:info@tushar-aggarwal.com). I would love to hear from you!