{"id":21457812,"url":"https://github.com/amha-kindu/terguami","last_synced_at":"2026-04-15T18:02:08.175Z","repository":{"id":263184522,"uuid":"889598645","full_name":"amha-kindu/terguami","owner":"amha-kindu","description":"Terguami is a FastAPI application powered by a custom transformer model built with PyTorch. It provides fast and accurate English-to-Amharic translations, with features like Docker support and interactive OpenAPI documentation, making it a scalable and easy-to-deploy solution for machine translation.","archived":false,"fork":false,"pushed_at":"2024-11-17T20:25:07.000Z","size":591,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-24T04:59:39.907Z","etag":null,"topics":["amharic-language","attention-is-all-you-need","docker","english-language","fastapi","machine-learning","model-training-and-evaluation","python","pytorch","swagger","transformer-architecture","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amha-kindu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-16T18:26:54.000Z","updated_at":"2024-11-17T20:25:11.000Z","dependencies_parsed_at":"2024-11-20T12:45:39.622Z","dependency_job_id":null,"html_url":"https://github.com/amha-kindu/terguami","commit_stats":null,"previous_names":["amha-kindu/terguami"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/amha-kindu/terguami","repository_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/amha-kindu%2Fterguami","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amha-kindu%2Fterguami/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amha-kindu%2Fterguami/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amha-kindu%2Fterguami/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amha-kindu","download_url":"https://codeload.github.com/amha-kindu/terguami/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amha-kindu%2Fterguami/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31853279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amharic-language","attention-is-all-you-need","docker","english-language","fastapi","machine-learning","model-training-and-evaluation","python","pytorch","swagger","transformer-architecture","transformers"],"created_at":"2024-11-23T06:07:11.719Z","updated_at":"2026-04-15T18:02:08.156Z","avatar_url":"https://github.com/amha-kindu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Terguami: English-to-Amharic Translation 
API\n\u003cp\u003e\n  \u003cimg src=\"https://img.shields.io/badge/FastAPI-v1.0.0-brightgreen.svg\" alt=\"FastAPI\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Python-3.10-blue.svg\" alt=\"Python\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/PyTorch-2.1.2-orange.svg\" alt=\"PyTorch\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Docker-Supported-blue.svg\" alt=\"Docker\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/License-MIT-lightgrey.svg\" alt=\"License\"\u003e\n\u003c/p\u003e\n\nThis repository provides an English-to-Amharic translation API built with **FastAPI**. The core of the application is a custom transformer-based model trained from scratch using **PyTorch** on a carefully curated dataset. It offers a robust, scalable solution for translating English text into Amharic, with performance and extensibility in mind.\n\n---\n\n## ✨ Features\n\n- 🚀 **High-Performance API**: Powered by FastAPI for asynchronous, fast, and efficient HTTP requests.\n- 🧠 **Custom Transformer Model**: A transformer-based machine learning model developed from scratch with PyTorch.\n- 📈 **Custom Dataset**: The model is trained on a carefully curated dataset to ensure accurate English-to-Amharic translations.\n- 📜 **Extensible API**: Modular structure for easy integration with other systems or expansion with new features.\n- 🔄 **OpenAPI Documentation**: Automatically generated and customizable API docs for developers.\n\n---\n\n## 🖼️ Demo Images\n\n### API in Action\n![API in Action](assets/translate_demo.png \"Demo of the Translation API in Action\")\n\n---\n\n## Table of Contents\n\n1. [Installation](#installation)\n2. [Usage](#usage)\n3. [Running with Docker](#running-with-docker)\n4. [Endpoints](#endpoints)\n5. [Model Details](#model-details)\n6. [Future Improvements](#future-improvements)\n7. 
[License](#license)\n\n---\n\n## ⚙️ Installation\n\n### Prerequisites\n- **Python 3.10+**\n- **pip** package manager\n\n### Clone the Repository\n```bash\ngit clone https://github.com/amha-kindu/terguami.git\ncd terguami\n```\n\n### Install Dependencies\n```bash\npip install -r requirements.txt\n```\n\n**`requirements.txt`**\n```\nfastapi==0.115.5\nuvicorn==0.32.2\npydantic==2.9.2\npydantic-settings==2.6.1\nnumpy==1.26.3\ntokenizers==0.15.0\ntorch==2.1.2\n```\n\n---\n\n## 🚀 Usage\n\n### Running the Server\nTo start the server locally, run:\n```bash\nuvicorn main:app --host 0.0.0.0 --port 7000 --reload\n```\n\nOnce running, the API will be accessible at `http://localhost:7000`.\n\n---\n\n## 🐳 Running with Docker\n\n### Prerequisites\n- **Docker** installed on your machine.\n\n### Build the Docker Image\n```bash\ndocker build -t terguami:v1 .\n```\n\n### Run the Docker Container\n```bash\ndocker run -d -p 7000:7000 --name terguami-api terguami:v1\n```\n\nThe API will now be accessible at `http://localhost:7000`.\n\n---\n\n### API Documentation\nVisit `http://localhost:7000/docs` for interactive OpenAPI documentation, or `http://localhost:7000/redoc` for an alternative documentation view.\n\n## 🌐 Endpoints\n\n### Health Check\n**`GET /health`**\n\nCheck the health status of the API.\n- **Response**: \n  ```json\n  { \"status\": \"healthy\" }\n  ```\n\n### Translation\n**`POST /translate`**\n\nTranslate English text to Amharic.\n- **Request Body**:\n  ```json\n  {\n    \"text\": \"Hello, how are you?\"\n  }\n  ```\n- **Response**:\n  ```json\n  {\n    \"original_text\":\"Hello, how are you?\",\n    \"translated_text\": \"ሰላም፣ እንዴት ነህ/ነሽ?\"\n  }\n  ```\n\n---\n\n\n## 🧠 Model Details\n\nThe translation model is a custom transformer-based architecture implemented in **PyTorch**, inspired by the \"Attention Is All You Need\" paper. 
It employs an encoder-decoder structure with multi-head self-attention in the encoder and cross-attention in the decoder to capture complex contextual relationships.\n\nKey highlights include:\n\n- **Custom Tokenizer**: Uses subword tokenization (Byte Pair Encoding) for robust handling of rare and compound words.\n- **Preprocessing**: Applies text preprocessing, including lowercasing, abbreviation normalization, normalization of character-level mismatches, punctuation \u0026 special-character removal, and stopword removal, for cleaner and more consistent input data.\n- **Dataset**: Trained on a curated English-Amharic parallel corpus, preprocessed to reduce noise and normalize tokens, with an 80-15-5 train-test-validation split.\n- **Training**: Optimized using **AdamW** with a warm-up scheduler and cross-entropy loss with label smoothing to enhance generalization.\n- **Inference**: Features beam search for improved accuracy and greedy decoding for faster results.\n\nThe model is further optimized for deployment with quantization to minimize latency and memory usage, ensuring efficiency in production environments.\n\n---\n\n## 🔨 Training Code\n\nThe training code for the custom transformer model is available in a separate repository by the same author. 
This code includes the full pipeline for training the model on a custom curated English-Amharic dataset, from data preprocessing to model optimization.\n\nYou can find the repository containing the training code here:\n- [Training Code Repository](https://github.com/amha-kindu/machine-translation)\n\nFeel free to explore the training process, contribute, or adapt the code for your own use!\n\n---\n\n## 🚧 Future Improvements\n\n- 🚧 Add support for **bi-directional translation** (Amharic-to-English).\n- ⚙️ Optimize inference speed using **ONNX Runtime** or **TorchScript**.\n- 📊 Deploy on **AWS Lambda** for serverless scalability.\n- 🔍 Implement advanced logging and monitoring with tools like **Prometheus**.\n\n---\n\n## 📜 License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n\n---\n\n## 👥 Contributors\n\nDeveloped and maintained by [Amha Kindu](https://github.com/amha-kindu).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famha-kindu%2Fterguami","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famha-kindu%2Fterguami","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famha-kindu%2Fterguami/lists"}