{"id":24583776,"url":"https://github.com/assemblyai-solutions/translation-benchmarks","last_synced_at":"2025-03-17T17:25:14.091Z","repository":{"id":226332403,"uuid":"768356554","full_name":"AssemblyAI-Solutions/translation-benchmarks","owner":"AssemblyAI-Solutions","description":"This project benchmarks different translation services on the CoVoST dataset. The goal is to compare the quality of translations provided by different vendors along with average latency. The quality is measured using BLEU scores.","archived":false,"fork":false,"pushed_at":"2024-03-07T02:44:12.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-24T04:53:32.913Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AssemblyAI-Solutions.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-03-06T23:45:08.000Z","updated_at":"2024-03-06T23:52:36.000Z","dependencies_parsed_at":"2024-03-07T03:51:12.106Z","dependency_job_id":null,"html_url":"https://github.com/AssemblyAI-Solutions/translation-benchmarks","commit_stats":null,"previous_names":["assemblyai-solutions/translation-benchmarks"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI-Solutions%2Ftranslation-benchmarks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI-Solutions%2Ftranslation-benchmarks/tags","releases_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/AssemblyAI-Solutions%2Ftranslation-benchmarks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI-Solutions%2Ftranslation-benchmarks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AssemblyAI-Solutions","download_url":"https://codeload.github.com/AssemblyAI-Solutions/translation-benchmarks/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244076055,"owners_count":20394048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-24T04:53:41.513Z","updated_at":"2025-03-17T17:25:14.067Z","avatar_url":"https://github.com/AssemblyAI-Solutions.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Translation Benchmarks\n\nThis project benchmarks different translation services on the CoVoST dataset. The goal is to compare the quality of translations provided by different vendors along with average latency. 
The quality is measured using BLEU scores.\n\nCurrent vendors include:\n- [Google Cloud Translation](https://cloud.google.com/translate/docs/reference/libraries/v2/python)\n- [DeepL](https://www.deepl.com/en/docs-api)\n- [OpenAI Whisper (STT + Translation)](https://openai.com/docs/)\n- [Gladia (STT + Translation)](https://docs.gladia.io/reference/)\n- [Python `translate` library](https://github.com/terryyin/translate-python) using [MyMemory](https://mymemory.translated.net/) API\n- [Meta's m2m100-1.2b](https://about.fb.com/news/2020/10/first-multilingual-machine-translation-model/) (self-hosted on [Cloudflare Workers](https://developers.cloudflare.com/workers-ai/models/translation/))\n\nFollow the steps below to run the benchmarks:\n\n## Setup\n\n1. **Download Common Voice and generate splits**: Follow the steps provided [here](https://github.com/facebookresearch/covost?tab=readme-ov-file#covost-2).\n\n2. **Install the requirements**: Run `pip install -r requirements.txt` in your terminal.\n\n3. **Add .env file with the following variables**: Make sure you have the API keys for the services you want to benchmark. The .env file should contain:\n\n    - `DEEPL`: DeepL API key\n    - `ASSEMBLY`: AssemblyAI API key\n    - `GOOGLE_APPLICATION_CREDENTIALS`: Path to Google Cloud credentials\n    - `OPENAI`: OpenAI API key\n    - `GLADIA`: Gladia API key\n    - `COMMONVOICE_DIR`: Path to Common Voice dataset\n\n## Configuration\n\n1. Update `VENDORS` in `main.py`: Add the list of vendors you want to use.\n\n2. Update `FROM_LANG` and `TO_LANG` in `main.py`: Specify the language pair you want to use.\n\n## Running the Benchmarks\n\n1. **Generate translations**: Run `python main.py` in your terminal. This will generate translations for each vendor.\n\n2. **Calculate BLEU scores**: Run `python calculations.py` in your terminal. This will calculate BLEU scores for each vendor.\n\n## Results\n\nThe outputs will be saved in the `outputs` folder. 
There will be a CSV file for each vendor.\n\n## More info on CoVoST\n[CoVoST](https://github.com/facebookresearch/covost) is a large-scale multilingual speech translation (ST) corpus based on Common Voice, created to foster ST research with the largest ever open dataset. Its latest version covers translations from English into 15 languages (Arabic, Catalan, Welsh, German, Estonian, Persian, Indonesian, Japanese, Latvian, Mongolian, Slovenian, Swedish, Tamil, Turkish, Chinese) and from 21 languages into English, including the 15 target languages as well as Spanish, French, Italian, Dutch, Portuguese, and Russian. It contains a total of 2,880 hours of speech from 78K speakers.\n\n## More info on BLEU scores\nBLEU measures the similarity between the machine-generated translation and the reference translations by comparing the presence and frequency of sequences of words (n-grams).\n\nA higher BLEU score indicates that the machine translation is closer to the human translations, implying better quality.\n\nHowever, BLEU does not account for the meaning of the text or for grammatical correctness in a broader sense.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fassemblyai-solutions%2Ftranslation-benchmarks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fassemblyai-solutions%2Ftranslation-benchmarks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fassemblyai-solutions%2Ftranslation-benchmarks/lists"}