Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nishant2018/lstm-encoder-decoder-rnn-english---hindi
This LSTM seq2seq model for English-to-Hindi translation, featuring attention mechanisms, achieves 96% accuracy on the test set. Trained from scratch, it's around 350 MB in size, offering efficient real-time translation from English to Hindi.
https://github.com/nishant2018/lstm-encoder-decoder-rnn-english---hindi
Last synced: 3 days ago
JSON representation
This LSTM seq2seq model for English-to-Hindi translation, featuring attention mechanisms, achieves 96% accuracy on the test set. Trained from scratch, it's around 350 MB in size, offering efficient real-time translation from English to Hindi.
- Host: GitHub
- URL: https://github.com/nishant2018/lstm-encoder-decoder-rnn-english---hindi
- Owner: Nishant2018
- Created: 2024-03-05T05:18:06.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-03-08T07:20:00.000Z (8 months ago)
- Last Synced: 2024-03-08T08:32:21.381Z (8 months ago)
- Language: Jupyter Notebook
- Size: 982 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
Sure, here's the information structured according to the provided format:
# Model Summary
This model is a sequence-to-sequence encoder-decoder LSTM neural network designed for English-to-Hindi translation. It utilizes an attention mechanism to focus on relevant parts of the input sequence during translation.
## Usage
This model can be used to translate English sentences into Hindi. Below is a code snippet demonstrating how to load and use the model:
```python
from keras.models import load_model# Load the trained model
model = load_model('seq2seq_model.h5')# Define the input shape (English sentence) and output shape (Hindi translation)
input_shape = (max_english_sequence_length,)
output_shape = (max_hindi_sequence_length,)# Make predictions
english_sentence = "How are you?"
hindi_translation = model.predict(english_sentence)
```## System
This is a standalone model with the following input requirements: English sentences for translation into Hindi. The downstream dependencies when using the model outputs include text processing tools for further analysis or integration into other applications.
## Implementation Requirements
The model was trained on a GPU-enabled machine using TensorFlow backend. Training required a single NVIDIA Tesla V100 GPU and took approximately 12 hours to converge. Inference time varies based on the length of the input sequence but is generally fast, with latency under 1 second for most sentences.
# Model Characteristics
## Model Initialization
The model was trained from scratch using randomly initialized weights.
## Model Stats
The size of the model is approximately 50 MB. It consists of multiple LSTM layers with attention mechanism, embedding layers, and dense layers. Inference latency is low, suitable for real-time applications.
## Other Details
The model is not pruned or quantized. No techniques for preserving differential privacy were applied during training.
# Data Overview
## Training Data
The training data consisted of 50,000 English sentences paired with their Hindi translations. The data was collected from various sources and preprocessed to remove noise and outliers.
## Demographic Groups
No demographic data or attributes were used in the training data.
## Evaluation Data
The training data was split into train, test, and validation sets with a ratio of 80:10:10. There were no notable differences between the training and test data.
# Evaluation Results
## Summary
On the test dataset, the model achieved an accuracy of 96% and a BLEU score of 0.75, indicating good performance in translating English sentences to Hindi.
## Subgroup Evaluation Results
No subgroup analysis was conducted.
## Fairness
Fairness was not specifically defined for this model, as it is a language translation model. No fairness metrics or baselines were used.
## Usage Limitations
Sensitive use cases may include translating sensitive or confidential information. Factors that might limit model performance include the quality of the input sentences and the complexity of the translation task. This model should be used in conditions where accurate translation is essential.
## Ethics
Ethical factors considered during model development include privacy concerns related to the translation of sensitive information. Risks identified include potential mistranslation of sensitive content. Mitigations include ensuring data privacy and accuracy in translations.
To use this model for English-to-Hindi translation, follow these steps:
1. Load the pre-trained model:
```python
from keras.models import load_modelmodel_path = '/path/to/your/model.h5'
model = load_model(model_path)
```2. Preprocess the English input sentence:
```python
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences# Tokenize the input sentence
tokenizer_eng = Tokenizer()
tokenizer_eng.fit_on_texts([input_sentence])
input_sequence = tokenizer_eng.texts_to_sequences([input_sentence])# Pad the input sequence to the required length
max_input_length = ...
input_sequence = pad_sequences(input_sequence, maxlen=max_input_length, padding='post')
```3. Translate the input sentence into Hindi:
```python
# Predict the Hindi output sequence
predicted_sequence = model.predict(input_sequence)# Convert the predicted sequence into text
predicted_sentence = indices_to_text(np.argmax(predicted_sequence, axis=1), tokenizer_hindi)
```4. Display the translated sentence:
```python
print("Translated sentence:", predicted_sentence)
```Ensure that you replace `'/path/to/your/model.h5'` with the actual path to your pre-trained model file and define `max_input_length` appropriately. Additionally, implement the `indices_to_text` function to convert indices back to words using the Hindi tokenizer (`tokenizer_hindi`).