{"id":18717362,"url":"https://github.com/rekola/translator","last_synced_at":"2025-08-31T18:32:26.147Z","repository":{"id":206795123,"uuid":"543006895","full_name":"rekola/translator","owner":"rekola","description":"Machine Translation Microservice with REST API","archived":false,"fork":false,"pushed_at":"2024-01-01T16:43:06.000Z","size":73,"stargazers_count":4,"open_issues_count":6,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-07T13:15:50.883Z","etag":null,"topics":["api","c-plus-plus","cpp","machine-translation","rest-api"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rekola.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-09-29T08:32:20.000Z","updated_at":"2024-09-13T08:23:52.000Z","dependencies_parsed_at":"2024-01-01T17:45:40.696Z","dependency_job_id":null,"html_url":"https://github.com/rekola/translator","commit_stats":null,"previous_names":["rekola/translator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Ftranslator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Ftranslator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Ftranslator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Ftranslator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rekola","download_url":"https://codeload.github.com/rekola/translator/tar.gz/refs/head
s/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":231615479,"owners_count":18400980,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","c-plus-plus","cpp","machine-translation","rest-api"],"created_at":"2024-11-07T13:15:54.245Z","updated_at":"2024-12-28T10:38:10.588Z","avatar_url":"https://github.com/rekola.png","language":"C++","readme":"# translator\n\n[![CI](https://github.com/rekola/translator/workflows/Ubuntu-CI/badge.svg)]()\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)\n\nMachine Translation Service based on MarianNMT. The project is a\nmicroservice that contains a web server that provides a REST based\nHTTP API for machine translation. FastText is used for input language\ndetection. This is a work in progress. The software is not yet\nmulti-threaded, so translation tasks block the server for their\nduration.\n\n## Dependencies\n\n- MarianNMT\n- OpenBLAS\n- {fmt}\n- fastText\n- cpp-httplib\n- utf8proc\n- nlohmann::json\n\n## Assets\n\nLanguage assets are hardcoded in the program. You must ensure that the needed assets are present in the assets subdirectory.\n\n### fastText model for language detection\n\nPlace lid.176.ftz in assets directory (download link: https://fasttext.cc/docs/en/language-identification.html)\n\n### Hardcoded language models\n\nPlace each language model in the assets directory in $SOURCELANG/$TARGETLANG subdirectory. 
New models can be downloaded from https://github.com/Helsinki-NLP/Opus-MT-train/tree/master/models and must be added to MarianTranslator.cpp and TranslationContext.cpp\n\n| Source | Target | Model | Bleu (Tatoeba) |\n| - | - | - | - |\n| en | fi | [opus+bt-2020-02-26.zip](https://object.pouta.csc.fi/OPUS-MT-models/en-fi/opus+bt-2020-02-26.zip) | 41.4 |\n| fi | en | [opus-2020-02-13.zip](https://object.pouta.csc.fi/OPUS-MT-models/fi-en/opus-2020-02-13.zip) | 57.4 |\n| fi | ru | [opus-2020-04-12.zip](https://object.pouta.csc.fi/OPUS-MT-models/fi-ru/opus-2020-04-12.zip) | 46.3 |\n| sv | en | [opus-2020-02-26.zip](https://object.pouta.csc.fi/OPUS-MT-models/sv-en/opus-2020-02-26.zip) | 64.5 |\n| et | en | [opus-2019-12-18.zip](https://object.pouta.csc.fi/OPUS-MT-models/et-en/opus-2019-12-18.zip) | 59.9 |\n| ru | en | [opus-2020-02-26.zip](https://object.pouta.csc.fi/OPUS-MT-models/ru-en/opus-2020-02-26.zip) | 61.1 |\n| de | en | [opus-2020-02-26.zip](https://object.pouta.csc.fi/OPUS-MT-models/de-en/opus-2020-02-26.zip) | 55.4 |\n| uk | en | [opus-2020-01-16.zip](https://object.pouta.csc.fi/OPUS-MT-models/uk-en/opus-2020-01-16.zip) | 64.1 |\n\n#### BLEU Score Interpretation\n\n| BLEU Score | Interpretation                                            |\n|------------|-----------------------------------------------------------|\n| \u003c 10       | Almost useless                                            |\n| 10 - 19    | Hard to get the gist                                      |\n| 20 - 29    | The gist is clear, but has significant grammatical errors |\n| 30 - 40    | Understandable to good translations                       |\n| 40 - 50    | High quality translations                                 |\n| 50 - 60    | Very high quality, adequate, and fluent translations      |\n| \u003e 60       | Quality often better than human                           |\n\n\n## Installation (Ubuntu)\n\n### Install dependencies\n\n```\nsudo apt install libutf8proc-dev libopenblas-dev 
libfmt-dev libfasttext-dev\n```\n\n### Compilation (CPU)\n\n```\nmkdir build\ncd build\ncmake ../ -DCOMPILE_CPU=on\ncmake --build .\n```\n\n### Compilation (GPU)\n\n```\nmkdir build\ncd build\ncmake ../ -DCOMPILE_CUDA=on\ncmake --build .\n```\n\n## Running\n\n```\n./translator\n```\n\nOpen the URL ```http://localhost:8080/translate?q=Hello%20world\u0026target=fi``` to test the service.\n\n## API\n\n### Endpoints\n\n| Path | Description |\n| - | - |\n| /translate | Translates input from the source language to the target language |\n\n### /translate\n\n| Parameter | Required | Description |\n| - | - | - |\n| source | No | Source language. If omitted, the language is detected automatically. |\n| target | Yes | Target language (e.g. en) |\n| q | Yes | Input text. Can be used multiple times. |\n| format | No | Output format. Currently ignored. |\n\n#### Response\n\n```json\n{\"data\":{\"translations\":[{\"detectedSourceLanguage\":\"en\",\"translatedText\":\" Hei maailma\"}]}}\n```\n\n## Known issues\n\n- Translation is done one sentence at a time, which can lead to suboptimal translations.\n- Duplicate parameters are ignored, so the same text cannot be translated more than once in a single request.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frekola%2Ftranslator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frekola%2Ftranslator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frekola%2Ftranslator/lists"}