{"id":18169669,"url":"https://github.com/mozilla/translations","last_synced_at":"2025-04-12T23:37:15.792Z","repository":{"id":37078856,"uuid":"363274616","full_name":"mozilla/translations","owner":"mozilla","description":"The code, training pipeline, and models that power Firefox Translations","archived":false,"fork":false,"pushed_at":"2025-04-12T00:40:26.000Z","size":13869,"stargazers_count":189,"open_issues_count":239,"forks_count":36,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-04-12T23:36:31.174Z","etag":null,"topics":["machine-learning","machine-translation","ml","neural-machine-translation"],"latest_commit_sha":null,"homepage":"https://mozilla.github.io/translations/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mozilla.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-30T22:36:48.000Z","updated_at":"2025-04-11T19:54:52.000Z","dependencies_parsed_at":"2024-11-01T17:18:42.980Z","dependency_job_id":"6e9a59c1-a688-40e9-9a9f-dde818f26f92","html_url":"https://github.com/mozilla/translations","commit_stats":null,"previous_names":["mozilla/translations"],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozilla%2Ftranslations","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozilla%2Ftranslations/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozilla%2Ftranslations/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/mozilla%2Ftranslations/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mozilla","download_url":"https://codeload.github.com/mozilla/translations/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248647257,"owners_count":21139081,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","machine-translation","ml","neural-machine-translation"],"created_at":"2024-11-02T14:05:15.159Z","updated_at":"2025-04-12T23:37:15.540Z","avatar_url":"https://github.com/mozilla.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Firefox Translations\n\nTraining pipelines and the inference engine for Firefox Translations machine translation models.\n\nThe trained models are hosted in the [firefox-translations-models](https://github.com/mozilla/firefox-translations-models/) repository.\nThey are compatible with [bergamot-translator](https://github.com/mozilla/bergamot-translator) and\npower Firefox web page translation starting with version 118.\n\nThe pipeline was originally developed as part of the [Bergamot](https://browser.mt/) project, which focuses on improving client-side machine translation in a web browser.\n\n[Documentation](https://mozilla.github.io/translations/)\n\n## Pipeline\n\nThe pipeline can train a translation model for a language pair end to end.\nTranslation quality depends on the chosen datasets, data cleaning procedures and hyperparameters.\nSome settings, especially low-resource 
languages, might require extra tuning.\n\nWe use the fast translation engine [Marian](https://marian-nmt.github.io).\n\nYou can find more details about the pipeline steps in the [documentation](docs/training/pipeline-steps.md).\n\n## Orchestrators\n\nAn orchestrator is responsible for workflow management and parallelization.\n\n- [Taskcluster](https://taskcluster.net/) - Mozilla's task execution framework. It is also used for Firefox CI.\n  It provides access to the hybrid cloud workers (GCP + on-prem) with increased scalability and observability.\n  [Usage instructions](docs/training/task-cluster.md).\n- [Snakemake](https://snakemake.github.io/) - a file-based orchestrator that can run the pipeline locally or on a Slurm cluster.\n  [Usage instructions](docs/training/snakemake.md). (The integration has not been maintained since Mozilla switched to Taskcluster. Contributions are welcome.)\n\n## Experiment tracking\n\n[Public training dashboard in Weights \u0026 Biases](https://wandb.ai/moz-translations/projects)\n\nMarian training metrics are parsed from logs and published using a custom module within the `tracking` directory.\nMore information is available in the [tracking documentation](docs/training/tracking.md).\n\n## Contributing\n\nContributions are welcome! 
See the [documentation on Contributing](docs/contributing/index.md) for more details.\n\nFeel free to ask questions in our Matrix channel [#firefoxtranslations:mozilla.org](https://matrix.to/#/#firefoxtranslations:mozilla.org).\n\n## Useful Links\n\n- [Reference papers](docs/README.md#references)\n- [Model training guide](docs/training/README.md) - practical advice on how to use the pipeline\n- [High-level overview post on Mozilla Hacks](https://hacks.mozilla.org/2022/06/training-efficient-neural-network-models-for-firefox-translations/)\n- [The Training Pipeline DAG](https://docs.google.com/presentation/d/1HkypImI_hbA3n1ljU57ZPAzW8PuQqdv2wrXqj688KtQ/edit?slide=id.g3421e8f521e_1_419#slide=id.g3421e8f521e_1_419)\n- [Lightning Talk on the Training Pipeline Overview](https://www.youtube.com/watch?v=TfDEAYCeF6s)\n- [Training and Experiment Dashboard](https://docs.google.com/spreadsheets/d/1Kiz9xUjo2jpeeVGtaL3jA_cLiCiiyz8GvIoQADMyYqo/edit?gid=0#gid=0)\n- [moz-fx-translations-data--303e-prod-translations-data](https://console.cloud.google.com/storage/browser/moz-fx-translations-data--303e-prod-translations-data) - Uploaded models\n- [Models released to Firefox](https://mozilla.github.io/translations/firefox-models/)\n- [Documentation of the Firefox integration](https://firefox-source-docs.mozilla.org/toolkit/components/translations/index.html)\n\n## Acknowledgements\n\nThis project uses materials developed by:\n\n- Bergamot project ([github](https://github.com/browsermt), [website](https://browser.mt/)) that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825303\n- HPLT project ([github](https://github.com/hplt-project), [website](https://hplt-project.org/)) that has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant 
number 10052546]\n- OPUS-MT project ([github](https://github.com/Helsinki-NLP/Opus-MT), [website](https://opus.nlpl.eu/))\n- Many other open source projects and research papers (see [References](docs/README.md#references))\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmozilla%2Ftranslations","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmozilla%2Ftranslations","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmozilla%2Ftranslations/lists"}