Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/facebookresearch/flores
Facebook Low Resource (FLoRes) MT Benchmark
Last synced: 28 days ago
- Host: GitHub
- URL: https://github.com/facebookresearch/flores
- Owner: facebookresearch
- License: other
- Archived: true
- Created: 2019-02-01T02:18:45.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2023-11-20T21:03:52.000Z (about 1 year ago)
- Last Synced: 2024-08-04T01:17:12.498Z (4 months ago)
- Language: Python
- Homepage:
- Size: 10.7 MB
- Stars: 679
- Watchers: 67
- Forks: 123
- Open Issues: 22
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE_CC-BY-NC4.0
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-kurdish - FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
README
# FLORES-200 and NLLB Professionally Translated Datasets: NLLB-Seed, NLLB-MD, and Toxicity-200
⚠️ This repository is no longer being updated ⚠️
**Newer versions** of the FLORES and NLLB-Seed datasets managed by the [Open Language Data Initiative](https://www.oldi.org/) are available here:
* [FLORES](https://github.com/openlanguagedata/flores)
* [NLLB-Seed](https://github.com/openlanguagedata/seed)

Quick-access to the original READMEs:
* [FLORES-200](flores200/README.md)
* [NLLB-Seed](nllb_seed/README.md)
* [NLLB-MD](nllb_md/README.md)
* [Toxicity-200](toxicity/README.md)

## Citation
If you use any of this data in your work, please cite:
```bibtex
@article{nllb2022,
author = {{NLLB Team} and Marta R. Costa-jussà and James Cross and Onur Çelebi and Maha Elbayad and Kenneth Heafield and Kevin Heffernan and Elahe Kalbassi and Janice Lam and Daniel Licht and Jean Maillard and Anna Sun and Skyler Wang and Guillaume Wenzek and Al Youngblood and Bapi Akula and Loic Barrault and Gabriel Mejia Gonzalez and Prangthip Hansanti and John Hoffman and Semarley Jarrett and Kaushik Ram Sadagopan and Dirk Rowe and Shannon Spruit and Chau Tran and Pierre Andrews and Necip Fazil Ayan and Shruti Bhosale and Sergey Edunov and Angela Fan and Cynthia Gao and Vedanuj Goswami and Francisco Guzmán and Philipp Koehn and Alexandre Mourachko and Christophe Ropers and Safiyyah Saleem and Holger Schwenk and Jeff Wang},
title = {No Language Left Behind: Scaling Human-Centered Machine Translation},
year = {2022}
}
```

## Changelog
- 2022-06-30: Released FLORES-200, NLLB-Seed, NLLB-MD, and Toxicity-200
- 2021-06-04: Released FLORES-101

## Licenses
* FLORES-200: CC-BY-SA 4.0
* NLLB-Seed: CC-BY-SA 4.0
* NLLB-MD: CC-BY-NC 4.0
* Toxicity-200: CC-BY-SA 4.0
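
Because the datasets do not all share one license, it can help to check programmatically which ones carry the NonCommercial (NC) element before combining them in a project. The following is a minimal sketch, not part of the repository: the license strings are copied from the list above, while `DATASET_LICENSES`, `is_noncommercial`, and `noncommercial_datasets` are hypothetical names introduced only for illustration.

```python
# License identifiers copied from the list above.
# The dict and helper names are hypothetical, not part of the repository.
DATASET_LICENSES = {
    "FLORES-200": "CC-BY-SA 4.0",
    "NLLB-Seed": "CC-BY-SA 4.0",
    "NLLB-MD": "CC-BY-NC 4.0",
    "Toxicity-200": "CC-BY-SA 4.0",
}


def is_noncommercial(license_name: str) -> bool:
    """Return True if the Creative Commons identifier includes the NC element."""
    # "CC-BY-NC 4.0" -> "CC-BY-NC" -> ["CC", "BY", "NC"]
    elements = license_name.split()[0].split("-")
    return "NC" in elements


def noncommercial_datasets(licenses: dict[str, str]) -> list[str]:
    """List the datasets whose license restricts commercial use."""
    return sorted(name for name, lic in licenses.items() if is_noncommercial(lic))


if __name__ == "__main__":
    print(noncommercial_datasets(DATASET_LICENSES))
```

This only inspects the license identifier string. It is not a substitute for reading the license files shipped with each dataset.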