{"id":29019479,"url":"https://github.com/nullhawk/translator","last_synced_at":"2025-06-26T00:30:41.587Z","repository":{"id":298234583,"uuid":"996419243","full_name":"nullHawk/translator","owner":"nullHawk","description":"RNN Based Seq2Seq Model to Translate English to Hindi","archived":false,"fork":false,"pushed_at":"2025-06-10T03:54:56.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-10T04:31:54.485Z","etag":null,"topics":["nmt","rnn","seq2seq"],"latest_commit_sha":null,"homepage":"https://huggingface.co/spaces/nullHawk/english-hindi_translator","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nullHawk.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-04T23:36:20.000Z","updated_at":"2025-06-10T04:14:20.000Z","dependencies_parsed_at":"2025-06-10T04:41:56.950Z","dependency_job_id":null,"html_url":"https://github.com/nullHawk/translator","commit_stats":null,"previous_names":["nullhawk/translator"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nullHawk/translator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nullHawk%2Ftranslator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nullHawk%2Ftranslator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nullHawk%2Ftranslator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nullHawk%2Ftranslator/manifests","own
er_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nullHawk","download_url":"https://codeload.github.com/nullHawk/translator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nullHawk%2Ftranslator/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261976602,"owners_count":23239160,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nmt","rnn","seq2seq"],"created_at":"2025-06-26T00:30:34.582Z","updated_at":"2025-06-26T00:30:41.315Z","avatar_url":"https://github.com/nullHawk.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# English to Hindi Neural Machine Translator\n\nThis project implements a neural machine translation system for translating English text to Hindi using a sequence-to-sequence (Seq2Seq) architecture with attention mechanism. 
The model is built with PyTorch and features a GRU-based encoder-decoder network.\n\n## Features\n\n- Seq2Seq architecture with an attention mechanism\n- GRU-based encoder and decoder\n- Customizable model parameters (hidden size, embedding dimensions, etc.)\n- Interactive translation through a command-line interface\n- Web interface using Gradio\n- Support for vocabulary management and data preprocessing\n- Teacher forcing during training\n\n## Project Structure\n\n```\n├── app.py                 # Gradio web interface for translation\n├── inference.py           # Functions for model inference\n├── train.py               # Model training script\n├── models/                # Neural network architecture components\n│   ├── encoder.py         # Encoder implementation\n│   ├── decoder.py         # Decoder implementation  \n│   ├── attention.py       # Attention mechanism\n│   └── seq2seq.py         # Seq2Seq model\n├── utils/                 # Utility functions\n│   ├── config.py          # Configuration parameters\n│   ├── data_loader.py     # Data loading utilities\n│   └── preprocessing.py   # Text preprocessing functions\n├── data/                  # Data directory\n│   └── hindi_english_parallel.csv  # Parallel corpus\n└── bin/                   # Model checkpoints and vocabularies\n    ├── seq2seq.pth        # Trained model weights\n    ├── eng_vocab.pkl      # English vocabulary\n    └── hin_vocab.pkl      # Hindi vocabulary\n```\n\n## Installation\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/nullHawk/translator.git\ncd translator\n```\n\n2. 
Install the required packages:\n```bash\npip install -r requirment.txt\n```\n\n## Usage\n\n### Training the Model\n\nTo train the translation model:\n\n```bash\npython train.py\n```\n\nThis will:\n- Load and preprocess the Hindi-English parallel corpus\n- Build vocabularies for both languages\n- Initialize and train the Seq2Seq model\n- Save model checkpoints after each epoch\n\n### Translation via Command Line\n\nFor interactive translation through command line:\n\n```bash\npython inference.py\n```\n\n### Web Interface\n\nTo launch the web interface for translation:\n\n```bash\npython app.py\n```\n\nThis will start a Gradio interface that you can access in your web browser.\n\n## Model Architecture\n\n- **Encoder**: GRU-based with configurable layers and embedding dimensions\n- **Decoder**: GRU with attention mechanism\n- **Attention**: Calculates attention scores between encoder outputs and decoder hidden states\n- **Training**: Uses teacher forcing and cross-entropy loss\n\n## Configuration\n\nModel parameters can be adjusted in the config.py file:\n\n- `embedding_dim`: Size of word embeddings\n- `hidden_size`: Size of hidden layers\n- `num_layers`: Number of RNN layers\n- `dropout`: Dropout rate\n- `batch_size`: Training batch size\n- `learning_rate`: Learning rate for optimizer\n- `epochs`: Number of training epochs\n- `teacher_forcing_ratio`: Ratio of teacher forcing during training\n- `max_vocab_english`: Maximum size of English vocabulary\n- `max_vocab_hindi`: Maximum size of Hindi vocabulary\n- `max_length`: Maximum sentence length\n\n## Requirements\n\n- torch\n- pandas\n- numpy\n- tqdm\n- gradio","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnullhawk%2Ftranslator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnullhawk%2Ftranslator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnullhawk%2Ftranslator/lists"}