{"id":20807484,"url":"https://github.com/tikquuss/meta_xlm","last_synced_at":"2025-07-12T01:39:39.513Z","repository":{"id":76992723,"uuid":"262747448","full_name":"Tikquuss/meta_XLM","owner":"Tikquuss","description":"Cross-lingual Language Model (XLM) pretraining and Model-Agnostic Meta-Learning (MAML) for fast adaptation of deep networks","archived":false,"fork":false,"pushed_at":"2021-03-26T04:39:50.000Z","size":34300,"stargazers_count":20,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-31T06:51:10.550Z","etag":null,"topics":["african-languages","back-translation","bert","bleu-scores","bpe-codes","clm","denoising-autoencoders","languages","machine-translation","maml","meta-model","mlm","parallel-training","pretrained-models","text-clustering","text-processing","tlm","xlm"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Tikquuss.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-10T08:55:34.000Z","updated_at":"2024-02-12T09:54:10.000Z","dependencies_parsed_at":null,"dependency_job_id":"d1235b84-e3a2-4cb0-ad0d-0a7a8cc6a5ae","html_url":"https://github.com/Tikquuss/meta_XLM","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tikquuss%2Fmeta_XLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tikquuss%2Fmeta_XLM/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/Tikquuss%2Fmeta_XLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tikquuss%2Fmeta_XLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Tikquuss","download_url":"https://codeload.github.com/Tikquuss/meta_XLM/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252823668,"owners_count":21809707,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["african-languages","back-translation","bert","bleu-scores","bpe-codes","clm","denoising-autoencoders","languages","machine-translation","maml","meta-model","mlm","parallel-training","pretrained-models","text-clustering","text-processing","tlm","xlm"],"created_at":"2024-11-17T19:38:08.781Z","updated_at":"2025-05-07T05:42:16.627Z","avatar_url":"https://github.com/Tikquuss.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"```\n@misc{\npascal2021on,\ntitle={On the use of linguistic similarities to improve Neural Machine Translation for African Languages},\nauthor={Tikeng Notsawo Pascal and NANDA ASSOBJIO Brice Yvan and James Assiene},\nyear={2021},\nurl={https://openreview.net/forum?id=Q5ZxoD2LqcI}\n}\n```\n\n## I. 
Cross-lingual language model pretraining ([XLM](https://github.com/facebookresearch/XLM)) \n\nXLM supports multi-GPU and multi-node training, and contains code for:\n- **Language model pretraining**:\n    - **Causal Language Model** (CLM)\n    - **Masked Language Model** (MLM)\n    - **Translation Language Model** (TLM)\n- **GLUE** fine-tuning\n- **XNLI** fine-tuning\n- **Supervised / Unsupervised MT** training:\n    - Denoising auto-encoder\n    - Parallel data training\n    - Online back-translation\n\n#### Dependencies\n\n- Python 3\n- [NumPy](http://www.numpy.org/)\n- [PyTorch](http://pytorch.org/) (currently tested on versions 0.4 and 1.0)\n- [fastBPE](https://github.com/facebookresearch/XLM/tree/master/tools#fastbpe) (generate and apply BPE codes)\n- [Moses](https://github.com/facebookresearch/XLM/tree/master/tools#tokenizers) (scripts to clean and tokenize text only - no installation required)\n- [Apex](https://github.com/nvidia/apex#quick-start) (for fp16 training)\n\n### Pretrained models  \n\u003ctable class=\"table table-striped\"\u003e\n    \u003ccaption\u003e\u003cb\u003eMachine Translation BLEU scores. The rows correspond to the pairs of interest on which\nBLEU scores are reported. The column None is a baseline: it represents the BLEU score of a\nmodel trained on the pair without any MLM or TLM pre-training. The column Pair is a baseline:\nit represents the BLEU score of a model trained on the pair with MLM and TLM pre-training. The\ncolumn Random is also a baseline: it is the BLEU score of a 3 languages multi-task model where\nthe language added was chosen purely at random. The column Historical refers to the BLEU score\nof our 3 languages multi-task model where the language added was chosen using historically identified clusters. 
The column LM describes the BLEU score of our 3 languages, multi-task model where the\nlanguage added was chosen using the LM similarity\u003c/b\u003e\u003c/caption\u003e\n    \u003cthead\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"col\"\u003ePretraining\u003c/th\u003e\n            \u003cth scope=\"col\"\u003eNone\u003c/th\u003e\n            \u003cth scope=\"col\"\u003ePair\u003c/th\u003e\n            \u003cth scope=\"col\"\u003eRandom\u003c/th\u003e\n            \u003cth scope=\"col\"\u003eHistorical\u003c/th\u003e\n            \u003cth scope=\"col\"\u003eLM\u003c/th\u003e\n        \u003c/tr\u003e\n    \u003c/thead\u003e\n    \u003ctbody\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBafia-Bulu\u003c/th\u003e\n            \u003ctd\u003e09.19\u003c/td\u003e\n            \u003ctd\u003e12.58\u003c/td\u003e\n            \u003ctd\u003e23.52\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e28.81\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e13.03\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBulu-Bafia\u003c/th\u003e\n            \u003ctd\u003e13.50\u003c/td\u003e\n            \u003ctd\u003e15.15\u003c/td\u003e\n            \u003ctd\u003e24.76\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e32.83\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e13.91\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBafia-Ewondo\u003c/th\u003e\n            \u003ctd\u003e09.30\u003c/td\u003e\n            \u003ctd\u003e11.28\u003c/td\u003e\n            \u003ctd\u003e08.28\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e38.90\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e38.90\u003c/b\u003e\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eEwondo-Bafia\u003c/th\u003e\n            \u003ctd\u003e13.99\u003c/td\u003e\n           
 \u003ctd\u003e16.07\u003c/td\u003e\n            \u003ctd\u003e10.26\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e35.84\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e35.84\u003c/b\u003e\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBulu-Ewondo\u003c/th\u003e\n            \u003ctd\u003e10.27\u003c/td\u003e\n            \u003ctd\u003e12.11\u003c/td\u003e\n            \u003ctd\u003e11.82\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e39.12\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e34.86\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eEwondo-Bulu\u003c/th\u003e\n            \u003ctd\u003e11.62\u003c/td\u003e\n            \u003ctd\u003e14.42\u003c/td\u003e\n            \u003ctd\u003e12.27\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e34.91\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003e30.98\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eGuidar-Guiziga\u003c/th\u003e\n            \u003ctd\u003e11.95\u003c/td\u003e\n            \u003ctd\u003e15.05\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eGuiziga-Guidar\u003c/th\u003e\n            \u003ctd\u003e08.05\u003c/td\u003e\n            \u003ctd\u003e08.94\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eGuiziga-Mofa\u003c/th\u003e\n            \u003ctd\u003e17.78\u003c/td\u003e\n            \u003ctd\u003e21.67\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n        
    \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eMofa-Guiziga\u003c/th\u003e\n            \u003ctd\u003e12.02\u003c/td\u003e\n            \u003ctd\u003e15.41\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eGuidar-Kapsiki\u003c/th\u003e\n            \u003ctd\u003e14.74\u003c/td\u003e\n            \u003ctd\u003e17.78\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eKapsiki-Guidar\u003c/th\u003e\n            \u003ctd\u003e08.63\u003c/td\u003e\n            \u003ctd\u003e09.33\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003eHistorical\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eFrench-Bulu\u003c/th\u003e\n            \u003ctd\u003e19.91\u003c/td\u003e\n            \u003ctd\u003e23.47\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e25.06\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBulu-French\u003c/th\u003e\n            \u003ctd\u003e17.49\u003c/td\u003e\n            \u003ctd\u003e22.44\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e23.68\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          
  \u003cth scope=\"row\"\u003eFrench-Bafia\u003c/th\u003e\n            \u003ctd\u003e14.48\u003c/td\u003e\n            \u003ctd\u003e15.35\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e30.65\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eBafia-French\u003c/th\u003e\n            \u003ctd\u003e08.59\u003c/td\u003e\n            \u003ctd\u003e11.17\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e24.49\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eFrench-Ewondo\u003c/th\u003e\n            \u003ctd\u003e11.51\u003c/td\u003e\n            \u003ctd\u003e13.93\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e35.50\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n            \u003cth scope=\"row\"\u003eEwondo-French\u003c/th\u003e\n            \u003ctd\u003e10.60\u003c/td\u003e\n            \u003ctd\u003e13.77\u003c/td\u003e\n            \u003ctd\u003eRandom\u003c/td\u003e\n            \u003ctd\u003e\u003cb\u003e27.34\u003c/b\u003e\u003c/td\u003e\n            \u003ctd\u003eLM\u003c/td\u003e\n        \u003c/tr\u003e\n    \u003c/tbody\u003e\n\u003c/table\u003e\n\n## II. Model-Agnostic Meta-Learning ([MAML](https://arxiv.org/abs/1703.03400))  \n\nSee [maml](https://github.com/cbfinn/maml), [learn2learn](https://github.com/learnables/learn2learn)...  
\n\nSee [HowToTrainYourMAMLPytorch](https://github.com/AntreasAntoniou/HowToTrainYourMAMLPytorch) for a replication of the paper [\"How to train your MAML\"](https://arxiv.org/abs/1810.09502), along with a replication of the original [\"Model Agnostic Meta Learning\"](https://arxiv.org/abs/1703.03400) (MAML) paper.\n\n## III. Train your own (meta-)model\n\n**Open the illustrative notebook in Colab** [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tikquuss/meta_XLM/blob/master/notebooks/demo/tuto.ipynb)\n\n**Note**: Most of the bash scripts in this repository were written on Windows, so their carriage-return line endings can trigger this [error](https://prograide.com/pregunta/5588/configure--bin--sh--m-mauvais-interpreteur) on Linux.  \nYou can fix a script by stripping the carriage returns with the following command: \n```\nfilename=my_file.sh \ncat $filename | tr -d '\\r' \u003e $filename.new \u0026\u0026 rm $filename \u0026\u0026 mv $filename.new $filename \n```\n### 1. Preparing the data \n\nIf you already have preprocessed binary data in `.pth` format (for example from XLM experiments, or produced yourself), group the files in one folder and pass that folder as a parameter when calling [train.py](XLM/train.py).  \nOtherwise, we assume that you have txt files available for preprocessing. Consider the following example, with three translation tasks: `English-French, German-English and German-French`.\n\nWe have the following files available for preprocessing: \n```\n- en-fr.en.txt and en-fr.fr.txt \n- de-en.de.txt and de-en.en.txt \n- de-fr.de.txt and de-fr.fr.txt \n```\nAll these files must be in the same folder (`PARA_PATH`).  \nYou can also have monolingual data (`en.txt, de.txt and fr.txt`, in the `MONO_PATH` folder), either in addition to the parallel data or on its own.  
\nParallel and monolingual data can all be in the same folder.\n\n**Note**: Languages must be given in alphabetical order (`de-en and not en-de, fr-ru and not ru-fr...`). Otherwise you will have problems loading data during training, because [train.py](XLM/train.py) puts parameters such as the language pair back in alphabetical order before processing them. This ordering is not a limitation: XLM for MT is trained to translate sentences in both directions. See [translate.py](scripts/translate.py).\n\n[OPUS collections](http://opus.nlpl.eu/) is a good source of datasets. The [opus.sh](scripts/opus.sh) script illustrates how to download data from OPUS and convert it to a text file.  \nChange the parameters (`$PARA_PATH` and `$SRC`) in [opus.sh](scripts/opus.sh), then run:\n```\ncd meta_XLM\nchmod +x ./scripts/opus.sh\n./scripts/opus.sh de-fr\n```\n\nAnother source for `other_languages-english` data is [anki Tab-delimited Bilingual Sentence Pairs](http://www.manythings.org/anki/). Simply download the .zip file and unzip it to extract the `other_language.txt` file. Each line of this file usually has the form `sentence_en sentence_other_language other_information`. See [anki.py](scripts/anki.py) and [anki.sh](scripts/anki.sh) for how to extract data from [anki](http://www.manythings.org/anki/). The following example downloads and extracts the `de-en` and `en-fr` pair data:\n```\ncd meta_XLM\noutput_path=/content/data/para\nmkdir -p $output_path\nchmod +x ./scripts/anki.sh\n./scripts/anki.sh de,en deu-eng $output_path scripts/anki.py\n./scripts/anki.sh en,fr fra-eng $output_path scripts/anki.py\n```\nAfter that, you will have the following files in `data/para`: `de-en.de.txt, de-en.en.txt, deu.txt, deu-eng.zip and _about.txt`  \n\nNext, move to the `XLM` folder.  
\n```\ncd XLM\n```\nInstall the following dependencies ([fastBPE](https://github.com/facebookresearch/XLM/tree/master/tools#fastbpe) and [Moses](https://github.com/facebookresearch/XLM/tree/master/tools#tokenizers)) if you have not already done so. \n```\ngit clone https://github.com/moses-smt/mosesdecoder tools/mosesdecoder\ngit clone https://github.com/glample/fastBPE tools/fastBPE \u0026\u0026 cd tools/fastBPE \u0026\u0026 g++ -std=c++11 -pthread -O3 fastBPE/main.cc -IfastBPE -o fast\n```\n  \nChange the parameters in [data.sh](data.sh). Between lines 94 and 100 of [data.sh](data.sh), there are two options, each corresponding to a script to run depending on how the folders containing your data are organized. Option 2 is enabled by default; uncomment the lines corresponding to your option.  \nWith too many BPE codes (depending on the size of the dataset) you may get this [error](https://github.com/glample/fastBPE/issues/7). In that case, decrease the number of codes (e.g. use a binary search to find the largest number of codes for which the error disappears).\n\n```\nlanguages=de,en,fr\nchmod +x ../data.sh \n../data.sh $languages\n```\n\nIf you interrupt the execution while a file is being processed, delete that incomplete file before resuming or restarting; otherwise the processing will continue with the corrupted file and errors are likely to follow.  
\n\nAfter this you will have the following (necessary) files in `$OUTPATH` (and `$OUTPATH/fine_tune` depending on the parameter `$sub_task`):  \n\n```\n- monolingual data :\n    - training data   : train.fr.pth, train.en.pth and train.de.pth\n    - test data       : test.fr.pth, test.en.pth and test.de.pth\n    - validation data : valid.fr.pth, valid.en.pth and valid.de.pth\n- parallel data :\n    - training data : \n        - train.en-fr.en.pth and train.en-fr.fr.pth \n        - train.de-en.en.pth and train.de-en.de.pth\n        - train.de-fr.de.pth and train.de-fr.fr.pth \n    - test data :\n        - test.en-fr.en.pth and test.en-fr.fr.pth \n        - test.de-en.en.pth and test.de-en.de.pth\n        - test.de-fr.de.pth and test.de-fr.fr.pth \n    - validation data\n        - valid.en-fr.en.pth and valid.en-fr.fr.pth \n        - valid.de-en.en.pth and valid.de-en.de.pth\n        - valid.de-fr.de.pth and valid.de-fr.fr.pth \n - code and vocab\n```\nTo use the biblical corpus, run [bible.sh](bible.sh) instead of [data.sh](data.sh). Here is the list of languages available (and to be specified as `$languages` value) in this case : \n- **Languages with data in the New and Old Testament** : `Francais, Anglais, Fulfulde_Adamaoua or Fulfulde_DC (formal name : Fulfulde), Bulu, KALATA_KO_SC_Gbaya or KALATA_KO_DC_Gbaya (formal name :  Gbaya), BIBALDA_TA_PELDETTA (formal name : MASSANA), Guiziga, Kapsiki_DC (formal name : Kapsiki), Tupurri`.\n- **Languages with data in the New Testament only** : `Bafia, Ejagham, Ghomala, MKPAMAN_AMVOE_Ewondo (formal name : Ewondo), Ngiemboon, Dii, Vute, Limbum, Mofa, Mofu_Gudur, Doyayo, Guidar, Peere_Nt\u0026Psalms, Samba_Leko, Du_na_sdik_na_wiini_Alaw`.  \nIt is specified in [bible.sh](bible.sh) that you must have in `csv_path` a folder named csvs. Here is the [drive link](https://drive.google.com/file/d/1NuSJ-NT_BsU1qopLu6avq6SzUEf6nVkk/view?usp=sharing) of its zipped version.  
\nFor training, specify the first four letters of each language (`Bafi` instead of `Bafia` for example), except `KALATA_KO_SC_Gbaya/KALATA_KO_DC_Gbaya which becomes Gbay (first letters of Gbaya), BIBALDA_TA_PELDETTA which becomes MASS (first letters of MASSANA), MKPAMAN_AMVOE_Ewondo which becomes Ewon (first letters of Ewondo), Francais and Anglais which become respectively fr and en`. This is because [bible.sh](bible.sh) uses these abbreviations, not the language names themselves, to create the files.  \nFinally, with the biblical corpus, when only one language is to be specified, it must be specified twice. For example: `languages=Bafia,Bafia` instead of `languages=Bafia`.\n\n### 2. Pretrain a language (meta-)model \n\nInstall the following dependency ([Apex](https://github.com/nvidia/apex#quick-start)) if you have not already done so.\n```\ngit clone https://github.com/NVIDIA/apex\npip install -v --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" ./apex\n```\n\nInstead of passing all the parameters to train.py on the command line, put them in a JSON file and pass the path to this file as a parameter (see the [lm_template.json](configs/lm_template.json) file for more details).\n```\nconfig_file=../configs/lm_template.json\npython train.py --config_file $config_file\n```\nIf you also pass a parameter directly when calling [train.py](XLM/train.py) (example: `python train.py --config_file $config_file --data_path my/data_path`), it overrides the value given in `$config_file`.  \nOnce training is finished, the `$dump_path/$exp_name/$exp_id` folder will contain a file named `train.log` with information about the training, along with your checkpoints and best model.  \nWhen `\"mlm_steps\":\"...\"`, train.py automatically uses the languages to set `\"mlm_steps\":\"de,en,fr,de-en,de-fr,en-fr\"` (give mlm_steps an explicit value if you don't want to do all MLM and TLM, for example: `\"mlm_steps\":\"en,fr,en-fr\"`). 
This also applies to `\"clm_steps\":\"...\"`, which becomes `\"clm_steps\":\"de,en,fr\"` in this case.    \n\nNote:  \n- `en` means MLM on `en`, and requires the following three files in `data_path`: `a.en.pth, a ∈ {train, test, valid} (monolingual data)`  \n- `en-fr` means TLM on `en and fr`, and requires the following six files in `data_path`: `a.en-fr.b.pth, a ∈ {train, test, valid} and b ∈ {en, fr} (parallel data)`  \n- `en,fr,en-fr` means MLM+TLM on `en and fr`, and requires the following twelve files in `data_path`: `a.b.pth and a.en-fr.b.pth, a ∈ {train, test, valid} and b ∈ {en, fr}`  \n\nTo [train with multiple GPUs](https://github.com/facebookresearch/XLM#how-can-i-run-experiments-on-multiple-gpus) use:\n```\nexport NGPU=8; python -m torch.distributed.launch --nproc_per_node=$NGPU train.py --config_file $config_file\n```\n\n**Tips**: Even when the validation perplexity plateaus, keep training your model. The larger the batch size the better (so using multiple GPUs will improve performance). Tuning the learning rate (e.g. [0.0001, 0.0002]) should help.\n\nIn the case of \u003cb\u003emetalearning\u003c/b\u003e, you just have to specify your meta-tasks, separated by `|`, in `lgs` and in the objectives (`clm_steps, mlm_steps, ae_steps, mt_steps, bt_steps and pc_steps`).  \nFor example, to do only metalearning (without XLM) in our case, specify these parameters: `\"lgs\":\"de-en|de-fr|en-fr\"`, `\"clm_steps\":\"...|...|...\"` and/or `\"mlm_steps\":\"...|...|...\"`. These last two parameters, if specified as such, become respectively `\"clm_steps\":\"de,en|de,fr|en,fr\"` and/or `\"mlm_steps\":\"de,en,de-en|de,fr,de-fr|en,fr,en-fr\"`.  \nThe three dots (`...`) are expanded following the same logic as above. 
That is, for the meta-task `de-en`:  \n\t- to do only MLM (without TLM), `mlm_steps` becomes `\"mlm_steps\": \"de,en|...|...\"`  \n\t- to do nothing, `mlm_steps` becomes `\"mlm_steps\":\"|...|...\"`.\n\nHowever, every meta-task must keep at least one objective. In our case, `\"clm_steps\":\"...||...\"` and/or `\"mlm_steps\":\"...||...\"` will raise an exception when the meta-task `de-fr` (the second task) is left with no objective at all.\n\nIf you want to do metalearning and XLM simultaneously: \n- `\"lgs\":\"de-en-fr|de-en-fr|de-en-fr\"` \n- Follow the same logic as described above for the other parameters.\n\n###### Description of some essential parameters\n\n```\n## main parameters\nexp_name                     # experiment name\nexp_id                       # experiment ID\ndump_path                    # where to store the experiment (the model will be stored in $dump_path/$exp_name/$exp_id)\n\n## data location / training objective\ndata_path                    # data location \nlgs                          # considered languages/meta-tasks\nclm_steps                    # CLM objective\nmlm_steps                    # MLM objective\n\n## transformer parameters\nemb_dim                      # embeddings / model dimension\nn_layers                     # number of layers\nn_heads                      # number of heads\ndropout                      # dropout\nattention_dropout            # attention dropout\ngelu_activation              # GELU instead of ReLU\n\n## optimization\nbatch_size                   # sequences per batch\nbptt                         # sequence length\noptimizer                    # optimizer\nepoch_size                   # number of sentences per epoch\nmax_epoch                    # maximum number of epochs\nvalidation_metrics           # validation metric (when to save the best model)\nstopping_criterion           # end experiment if stopping criterion does not improve\n\n## dataset\n#### 
These three parameters will always be rounded to an integer number of batches, so don't be surprised if you see different values than the ones provided.\ntrain_n_samples              # only consider train_n_samples training examples\nvalid_n_samples              # only consider valid_n_samples validation examples \ntest_n_samples               # only consider test_n_samples test examples\n#### If you don't have enough RAM/GPU or swap memory, leave these three parameters set to True, otherwise you may get an error like this when evaluating:\n###### RuntimeError: copy_if failed to synchronize: cudaErrorAssert: device-side assert triggered\nremove_long_sentences_train # remove long sentences in train dataset      \nremove_long_sentences_valid # remove long sentences in valid dataset  \nremove_long_sentences_test  # remove long sentences in test dataset  \n```\n\n###### There are other parameters that are not specified here (see [train.py](XLM/train.py))\n\n### 3. Train a (unsupervised/supervised) MT model from a pretrained meta-model \n\nSee the [mt_template.json](configs/mt_template.json) file for more details.\n```\nconfig_file=../configs/mt_template.json\npython train.py --config_file $config_file\n```\n\nWhen only the `ae_steps` and `bt_steps` objectives are specified, training is unsupervised machine translation and requires only monolingual data. If parallel data is available, give `mt_steps` a value based on the language pairs for which it is available.  \nAll comments made above about parameter passing and \u003cb\u003emetalearning\u003c/b\u003e remain valid here: to exclude a meta-task from an objective, leave its place empty. 
Suppose, in the case of \u003cb\u003emetalearning\u003c/b\u003e, we want to exclude a meta-task from `\"ae_steps\":\"en,fr|de,en|de,fr\"`:\n- to exclude `de-en`, `ae_steps` becomes `\"ae_steps\":\"en,fr||de,fr\"` \n- to exclude `de-fr`, `ae_steps` becomes `\"ae_steps\":\"en,fr|de,en|\"`  \n\n###### Description of some essential parameters  \nThe description given above remains valid here.\n```\n## main parameters\nreload_model     # model to reload for encoder,decoder\n## data location / training objective\nae_steps          # denoising auto-encoder training steps\nbt_steps          # back-translation steps\nmt_steps          # parallel training steps\nword_shuffle      # noise for auto-encoding loss\nword_dropout      # noise for auto-encoding loss\nword_blank        # noise for auto-encoding loss\nlambda_ae         # scheduling on the auto-encoding coefficient\n\n## transformer parameters\nencoder_only      # set to false to use a decoder for MT\n\n## optimization\ntokens_per_batch  # use batches with a fixed number of words\neval_bleu         # also evaluate the BLEU score\n```\n###### There are other parameters that are not specified here (see [train.py](XLM/train.py))\n\n\n### 4. Case of metalearning: optionally fine-tune the meta-model on a specific (sub) NMT (meta) task \n\nAt this point, if your fine-tuning data did not come from the previous preprocessing, you can just prepare your txt data and call the script build_meta_data.sh with the (sub) task in question. Since the codes and vocabulary must be preserved, we have prepared another script ([build_fine_tune_data.sh](scripts/build_fine_tune_data.sh)) that applies BPE tokenization directly to the dataset and binarizes everything using preprocess.py, based on the codes and vocabulary of the meta-model. 
So this script has to be called for each subtask, like this:\n\n```\nlanguages = \nchmod +x ../ft_data.sh\n../ft_data.sh $languages\n```\n\nAt this stage, restart the training as in the previous section with:\n- lgs=\"en-fr\"\n- reload_model = path to the folder where you stored the meta-model\n- `\"bt_steps\":\"...\"`, `\"ae_steps\":\"...\"` and/or `\"mt_steps\":\"...\"` (replace the three dots with your specific objectives, if any)  \nYou can use either of the two previously trained meta-models: the pretrained meta-model (MLM, TLM) or the meta-MT model trained from the pretrained meta-model. \n\n### 5. How to evaluate a language model trained on a language L on another language L'.\n\n###### Our\n\n\u003ctable class='table table-striped'\u003e\u003ccaption\u003e\u003cb\u003e?\u003c/b\u003e\u003c/caption\u003e\u003cthead\u003e\u003ctr\u003e\u003cth scope='col'\u003eEvaluated on (cols)---------\u003cbr/\u003eTrained on (rows)\u003c/th\u003e\u003cth scope='col'\u003eBafi\u003c/th\u003e\u003cth scope='col'\u003eBulu\u003c/th\u003e\u003cth scope='col'\u003eEwon\u003c/th\u003e\u003cth scope='col'\u003eGhom\u003c/th\u003e\u003cth scope='col'\u003eLimb\u003c/th\u003e\u003cth scope='col'\u003eNgie\u003c/th\u003e\u003cth scope='col'\u003eDii\u003c/th\u003e\u003cth scope='col'\u003eDoya\u003c/th\u003e\u003cth scope='col'\u003ePeer\u003c/th\u003e\u003cth scope='col'\u003eSamb\u003c/th\u003e\u003cth scope='col'\u003eGuid\u003c/th\u003e\u003cth scope='col'\u003eGuiz\u003c/th\u003e\u003cth scope='col'\u003eKaps\u003c/th\u003e\u003cth scope='col'\u003eMofa\u003c/th\u003e\u003cth scope='col'\u003eMofu\u003c/th\u003e\u003cth scope='col'\u003eDu_n\u003c/th\u003e\u003cth scope='col'\u003eEjag\u003c/th\u003e\u003cth scope='col'\u003eFulf\u003c/th\u003e\u003cth scope='col'\u003eGbay\u003c/th\u003e\u003cth scope='col'\u003eMASS\u003c/th\u003e\u003cth scope='col'\u003eTupu\u003c/th\u003e\u003cth scope='col'\u003eVute\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eBafi\u003c/th\u003e\u003ctd\u003e15.155782/46.113990\u003c/td\u003e\u003ctd\u003e3522.435230/12.694301\u003c/td\u003e\u003ctd\u003e10532.574414/3.108808\u003c/td\u003e\u003ctd\u003e3414.970521/10.103627\u003c/td\u003e\u003ctd\u003e3662.233924/10.880829\u003c/td\u003e\u003ctd\u003e4476.028980/2.072539\u003c/td\u003e\u003ctd\u003e4594.588311/10.362694\u003c/td\u003e\u003ctd\u003e3840.575574/13.989637\u003c/td\u003e\u003ctd\u003e\u003cb\u003e3111.148085/13.212435\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e4210.511141/8.031088\u003c/td\u003e\u003ctd\u003e6607.939683/2.590674\u003c/td\u003e\u003ctd\u003e7506.246899/3.108808\u003c/td\u003e\u003ctd\u003e11121.594025/3.367876\u003c/td\u003e\u003ctd\u003e3122.591005/13.212435\u003c/td\u003e\u003ctd\u003e3183.283705/10.621762\u003c/td\u003e\u003ctd\u003e5504.065998/8.549223\u003c/td\u003e\u003ctd\u003e4127.620979/3.108808\u003c/td\u003e\u003ctd\u003e9107.779213/6.994819\u003c/td\u003e\u003ctd\u003e7440.762805/3.886010\u003c/td\u003e\u003ctd\u003e4916.778213/12.176166\u003c/td\u003e\u003ctd\u003e8239.932584/4.922280\u003c/td\u003e\u003ctd\u003e3192.590598/10.362694\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eBulu\u003c/th\u003e\u003ctd\u003e\u003cb\u003e577.711688/9.585492\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e18.602898/43.264249\u003c/td\u003e\u003ctd\u003e795.094593/17.357513\u003c/td\u003e\u003ctd\u003e589.636415/13.471503\u003c/td\u003e\u003ctd\u003e1482.709434/8.549223\u003c/td\u003e\u003ctd\u003e1113.122905/12.435233\u003c/td\u003e\u003ctd\u003e994.030274/11.658031\u003c/td\u003e\u003ctd\u003e820.063393/10.103627\u003c/td\u003e\u003ctd\u003e828.162228/11.658031\u003c/td\u003e\u003ctd\u003e1519.449874/3.367876\u003c/td\u003e\u003ctd\u003e1183.604483/9.326425\u003c/td\u003e\u003ctd\u003e671.542857/13.989637\u003c/td\u003e\u003ctd\u003e1427.515245/5.440415\u003c/td\u003e\u003ctd\u003e657.031222/13.212435\u003c/td\u003e\u003ctd\u003e1018.342338/6.217617\u003c/td\u003e\u003ctd\u003e602.305603/10.880829\u003c/td\u003e\u003ctd\u003e1066.765090/6.994819\u003c/td\u003e\u003ctd\u003e1349.669421/6.476684\u003c/td\u003e\u003ctd\u003e605.298410/13.989637\u003c/td\u003e\u003ctd\u003e1615.328636/5.699482\u003c/td\u003e\u003ctd\u003e2493.141092/8.290155\u003c/td\u003e\u003ctd\u003e699.009937/13.730570\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eEwon\u003c/th\u003e\u003ctd\u003e2930.433348/13.730570\u003c/td\u003e\u003ctd\u003e\u003cb\u003e784.556467/12.435233\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e439.343693/11.139896\u003c/td\u003e\u003ctd\u003e8576.270483/3.886010\u003c/td\u003e\u003ctd\u003e1408.305834/12.176166\u003c/td\u003e\u003ctd\u003e6329.517824/5.181347\u003c/td\u003e\u003ctd\u003e4374.527024/8.031088\u003c/td\u003e\u003ctd\u003e5703.222147/4.922280\u003c/td\u003e\u003ctd\u003e3226.438808/13.471503\u003c/td\u003e\u003ctd\u003e5147.417352/9.585492\u003c/td\u003e\u003ctd\u003e7383.547206/3.886010\u003c/td\u003e\u003ctd\u003e2049.974847/13.730570\u003c/td\u003e\u003ctd\u003e3458.765537/12.176166\u003c/td\u003e\u003ctd\u003e1428.351000/11.139896\u003c/td\u003e\u003ctd\u003e4890.406327/1.813472\u003c/td\u003e\u003ctd\u003e2050.215975/11.917098\u003c/td\u003e\u003ctd\u003e4693.132443/2.331606\u003c/td\u003e\u003ctd\u003e3796.911033/9.844560\u003c/td\u003e\u003ctd\u003e4985.892435/7.253886\u003c/td\u003e\u003ctd\u003e3737.211837/11.658031\u003c/td\u003e\u003ctd\u003e8497.461052/1.036269\u003c/td\u003e\u003ctd\u003e8105.614715/2.590674\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eGhom\u003c/th\u003e\u003ctd\u003e10826.769423/12.176166\u003c/td\u003e\u003ctd\u003e7919.745037/10.621762\u003c/td\u003e\u003ctd\u003e13681.624683/6.735751\u003c/td\u003e\u003ctd\u003e112.759549/22.538860\u003c/td\u003e\u003ctd\u003e8550.764036/13.212435\u003c/td\u003e\u003ctd\u003e21351.213307/11.658031\u003c/td\u003e\u003ctd\u003e\u003cb\u003e5724.234345/11.917098\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e7638.186054/10.621762\u003c/td\u003e\u003ctd\u003e8992.791640/6.735751\u003c/td\u003e\u003ctd\u003e9870.502751/5.440415\u003c/td\u003e\u003ctd\u003e8671.271306/14.248705\u003c/td\u003e\u003ctd\u003e7952.305962/9.844560\u003c/td\u003e\u003ctd\u003e17073.248866/7.253886\u003c/td\u003e\u003ctd\u003e17507.383398/3.626943\u003c/td\u003e\u003ctd\u003e6253.188979/12.435233\u003c/td\u003e\u003ctd\u003e6616.060359/9.585492\u003c/td\u003e\u003ctd\u003e31826.000072/3.108808\u003c/td\u003e\u003ctd\u003e11636.816092/11.398964\u003c/td\u003e\u003ctd\u003e6129.150512/14.507772\u003c/td\u003e\u003ctd\u003e9667.854370/11.139896\u003c/td\u003e\u003ctd\u003e14276.187678/8.031088\u003c/td\u003e\u003ctd\u003e7152.109226/12.953368\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eLimb\u003c/th\u003e\u003ctd\u003e2348.605310/7.772021\u003c/td\u003e\u003ctd\u003e5910.088736/10.103627\u003c/td\u003e\u003ctd\u003e11640.836610/2.331606\u003c/td\u003e\u003ctd\u003e2234.982947/8.031088\u003c/td\u003e\u003ctd\u003e16.486114/48.186528\u003c/td\u003e\u003ctd\u003e5240.029343/10.880829\u003c/td\u003e\u003ctd\u003e3485.743598/11.139896\u003c/td\u003e\u003ctd\u003e\u003cb\u003e1744.289850/10.880829\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e2357.786346/11.658031\u003c/td\u003e\u003ctd\u003e2829.453145/10.362694\u003c/td\u003e\u003ctd\u003e6097.658965/6.735751\u003c/td\u003e\u003ctd\u003e2806.032546/9.326425\u003c/td\u003e\u003ctd\u003e2530.422427/11.139896\u003c/td\u003e\u003ctd\u003e2234.037369/14.507772\u003c/td\u003e\u003ctd\u003e3106.412553/9.067358\u003c/td\u003e\u003ctd\u003e5675.990382/8.549223\u003c/td\u003e\u003ctd\u003e4323.215519/10.880829\u003c/td\u003e\u003ctd\u003e5303.094881/7.512953\u003c/td\u003e\u003ctd\u003e3222.476499/10.362694\u003c/td\u003e\u003ctd\u003e2619.771393/12.435233\u003c/td\u003e\u003ctd\u003e6315.916126/12.435233\u003c/td\u003e\u003ctd\u003e1965.282639/9.326425\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eNgie\u003c/th\u003e\u003ctd\u003e2494.668579/10.621762\u003c/td\u003e\u003ctd\u003e1683.610004/7.772021\u003c/td\u003e\u003ctd\u003e\u003cb\u003e645.074490/13.212435\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e2747.857945/10.621762\u003c/td\u003e\u003ctd\u003e865.229192/8.031088\u003c/td\u003e\u003ctd\u003e53.604331/32.642487\u003c/td\u003e\u003ctd\u003e3487.877577/5.440415\u003c/td\u003e\u003ctd\u003e2973.100164/9.844560\u003c/td\u003e\u003ctd\u003e1694.041692/9.844560\u003c/td\u003e\u003ctd\u003e2285.872589/8.808290\u003c/td\u003e\u003ctd\u003e3555.658122/3.626943\u003c/td\u003e\u003ctd\u003e2240.803918/4.663212\u003c/td\u003e\u003ctd\u003e8214.745127/2.849741\u003c/td\u003e\u003ctd\u003e2162.964776/8.290155\u003c/td\u003e\u003ctd\u003e4130.931993/5.699482\u003c/td\u003e\u003ctd\u003e1251.907556/9.585492\u003c/td\u003e\u003ctd\u003e1406.624816/6.735751\u003c/td\u003e\u003ctd\u003e1134.593481/8.031088\u003c/td\u003e\u003ctd\u003e3484.481404/9.844560\u003c/td\u003e\u003ctd\u003e1587.951832/9.326425\u003c/td\u003e\u003ctd\u003e1786.015603/9.326425\u003c/td\u003e\u003ctd\u003e2117.031454/10.103627\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eDii\u003c/th\u003e\u003ctd\u003e5369.974508/5.181347\u003c/td\u003e\u003ctd\u003e3526.951377/11.917098\u003c/td\u003e\u003ctd\u003e4466.736657/2.590674\u003c/td\u003e\u003ctd\u003e3468.181916/8.808290\u003c/td\u003e\u003ctd\u003e1524.457754/10.880829\u003c/td\u003e\u003ctd\u003e\u003cb\u003e856.533233/10.362694\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e16.031832/47.150259\u003c/td\u003e\u003ctd\u003e3570.945172/11.658031\u003c/td\u003e\u003ctd\u003e1933.128270/11.139896\u003c/td\u003e\u003ctd\u003e3086.805425/7.253886\u003c/td\u003e\u003ctd\u003e5545.945984/3.626943\u003c/td\u003e\u003ctd\u003e1592.451661/11.139896\u003c/td\u003e\u003ctd\u003e7351.154713/2.331606\u003c/td\u003e\u003ctd\u003e1430.511351/14.248705\u003c/td\u003e\u003ctd\u003e4198.900876/4.145078\u003c/td\u003e\u003ctd\u003e2587.338616/8.290155\u003c/td\u003e\u003ctd\u003e3315.158358/2.590674\u003c/td\u003e\u003ctd\u003e2903.721453/8.808290\u003c/td\u003e\u003ctd\u003e4416.753252/3.886010\u003c/td\u003e\u003ctd\u003e3044.769713/5.440415\u003c/td\u003e\u003ctd\u003e3276.637193/10.362694\u003c/td\u003e\u003ctd\u003e3551.309415/8.808290\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eDoya\u003c/th\u003e\u003ctd\u003e2413.178389/7.253886\u003c/td\u003e\u003ctd\u003e2925.237118/9.326425\u003c/td\u003e\u003ctd\u003e3035.126064/9.844560\u003c/td\u003e\u003ctd\u003e6431.020717/4.404145\u003c/td\u003e\u003ctd\u003e2888.802299/10.362694\u003c/td\u003e\u003ctd\u003e4296.348738/9.585492\u003c/td\u003e\u003ctd\u003e1963.357861/9.067358\u003c/td\u003e\u003ctd\u003e225.399738/14.507772\u003c/td\u003e\u003ctd\u003e2647.241446/4.663212\u003c/td\u003e\u003ctd\u003e3559.797389/1.036269\u003c/td\u003e\u003ctd\u003e3224.327707/8.549223\u003c/td\u003e\u003ctd\u003e1628.560179/16.062176\u003c/td\u003e\u003ctd\u003e7036.636934/2.072539\u003c/td\u003e\u003ctd\u003e2378.384535/7.772021\u003c/td\u003e\u003ctd\u003e2526.667089/10.103627\u003c/td\u003e\u003ctd\u003e2560.562728/10.362694\u003c/td\u003e\u003ctd\u003e3486.425933/7.253886\u003c/td\u003e\u003ctd\u003e4898.016349/6.217617\u003c/td\u003e\u003ctd\u003e\u003cb\u003e1336.163366/12.176166\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e5378.777228/0.518135\u003c/td\u003e\u003ctd\u003e2334.347220/9.585492\u003c/td\u003e\u003ctd\u003e4210.426671/6.476684\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003ePeer\u003c/th\u003e\u003ctd\u003e5417.812131/7.253886\u003c/td\u003e\u003ctd\u003e3718.857566/8.290155\u003c/td\u003e\u003ctd\u003e3921.429577/10.103627\u003c/td\u003e\u003ctd\u003e8042.333854/2.590674\u003c/td\u003e\u003ctd\u003e4744.329113/12.435233\u003c/td\u003e\u003ctd\u003e2378.606152/7.772021\u003c/td\u003e\u003ctd\u003e4297.265443/7.253886\u003c/td\u003e\u003ctd\u003e7835.525318/3.108808\u003c/td\u003e\u003ctd\u003e27.612503/46.113990\u003c/td\u003e\u003ctd\u003e8547.481994/3.367876\u003c/td\u003e\u003ctd\u003e7819.217930/4.922280\u003c/td\u003e\u003ctd\u003e\u003cb\u003e2009.553562/13.730570\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e7929.664487/2.590674\u003c/td\u003e\u003ctd\u003e5227.466016/3.108808\u003c/td\u003e\u003ctd\u003e2828.595071/10.103627\u003c/td\u003e\u003ctd\u003e3109.933571/11.398964\u003c/td\u003e\u003ctd\u003e3449.171674/7.512953\u003c/td\u003e\u003ctd\u003e7517.809582/5.181347\u003c/td\u003e\u003ctd\u003e3593.460649/9.326425\u003c/td\u003e\u003ctd\u003e6490.444215/5.181347\u003c/td\u003e\u003ctd\u003e8583.548031/6.994819\u003c/td\u003e\u003ctd\u003e3640.649700/9.585492\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eSamb\u003c/th\u003e\u003ctd\u003e1921.203126/10.621762\u003c/td\u003e\u003ctd\u003e2876.156252/8.808290\u003c/td\u003e\u003ctd\u003e5222.268404/2.331606\u003c/td\u003e\u003ctd\u003e2258.419159/8.808290\u003c/td\u003e\u003ctd\u003e2940.603464/9.844560\u003c/td\u003e\u003ctd\u003e\u003cb\u003e757.885957/10.362694\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e2852.564926/3.886010\u003c/td\u003e\u003ctd\u003e3568.046199/9.585492\u003c/td\u003e\u003ctd\u003e3198.132105/11.658031\u003c/td\u003e\u003ctd\u003e14.473909/45.336788\u003c/td\u003e\u003ctd\u003e2135.946491/9.326425\u003c/td\u003e\u003ctd\u003e1882.791510/12.435233\u003c/td\u003e\u003ctd\u003e1380.449126/12.694301\u003c/td\u003e\u003ctd\u003e2739.728389/6.217617\u003c/td\u003e\u003ctd\u003e1114.151589/13.989637\u003c/td\u003e\u003ctd\u003e2588.952886/10.362694\u003c/td\u003e\u003ctd\u003e2408.673909/9.844560\u003c/td\u003e\u003ctd\u003e1012.804391/13.471503\u003c/td\u003e\u003ctd\u003e4310.704371/6.217617\u003c/td\u003e\u003ctd\u003e2429.426652/3.108808\u003c/td\u003e\u003ctd\u003e1681.603952/7.772021\u003c/td\u003e\u003ctd\u003e2305.207465/4.404145\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eGuid\u003c/th\u003e\u003ctd\u003e11105.869490/11.917098\u003c/td\u003e\u003ctd\u003e11350.393050/8.549223\u003c/td\u003e\u003ctd\u003e24157.732815/2.331606\u003c/td\u003e\u003ctd\u003e28800.139343/5.440415\u003c/td\u003e\u003ctd\u003e9497.473893/11.139896\u003c/td\u003e\u003ctd\u003e11941.642599/11.658031\u003c/td\u003e\u003ctd\u003e26891.060403/2.072539\u003c/td\u003e\u003ctd\u003e35288.834478/3.367876\u003c/td\u003e\u003ctd\u003e11458.390164/9.326425\u003c/td\u003e\u003ctd\u003e8581.012321/12.953368\u003c/td\u003e\u003ctd\u003e669.152371/22.020725\u003c/td\u003e\u003ctd\u003e\u003cb\u003e8237.415053/12.953368\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e24641.309182/3.626943\u003c/td\u003e\u003ctd\u003e12256.261503/6.735751\u003c/td\u003e\u003ctd\u003e8329.239657/15.025907\u003c/td\u003e\u003ctd\u003e18733.469719/2.590674\u003c/td\u003e\u003ctd\u003e13013.633062/11.398964\u003c/td\u003e\u003ctd\u003e22151.485850/4.922280\u003c/td\u003e\u003ctd\u003e15139.079118/12.176166\u003c/td\u003e\u003ctd\u003e12649.997596/11.139896\u003c/td\u003e\u003ctd\u003e13526.708187/9.844560\u003c/td\u003e\u003ctd\u003e14521.723680/13.471503\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eGuiz\u003c/th\u003e\u003ctd\u003e1900.984819/11.917098\u003c/td\u003e\u003ctd\u003e3422.299591/5.440415\u003c/td\u003e\u003ctd\u003e2920.779863/13.212435\u003c/td\u003e\u003ctd\u003e2657.232975/3.886010\u003c/td\u003e\u003ctd\u003e7763.772745/6.217617\u003c/td\u003e\u003ctd\u003e2516.088934/11.398964\u003c/td\u003e\u003ctd\u003e1556.474440/12.953368\u003c/td\u003e\u003ctd\u003e\u003cb\u003e1450.939238/12.694301\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e1852.263760/12.435233\u003c/td\u003e\u003ctd\u003e3503.139397/5.440415\u003c/td\u003e\u003ctd\u003e1957.981930/7.772021\u003c/td\u003e\u003ctd\u003e5.612643/60.362694\u003c/td\u003e\u003ctd\u003e2030.975178/10.621762\u003c/td\u003e\u003ctd\u003e3100.456750/9.585492\u003c/td\u003e\u003ctd\u003e3816.057439/9.067358\u003c/td\u003e\u003ctd\u003e2527.372931/10.103627\u003c/td\u003e\u003ctd\u003e2017.135324/9.585492\u003c/td\u003e\u003ctd\u003e1771.010720/12.953368\u003c/td\u003e\u003ctd\u003e2467.262902/9.067358\u003c/td\u003e\u003ctd\u003e6465.542228/6.735751\u003c/td\u003e\u003ctd\u003e4936.521836/5.181347\u003c/td\u003e\u003ctd\u003e3251.372451/4.663212\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eKaps\u003c/th\u003e\u003ctd\u003e4787.151015/7.772021\u003c/td\u003e\u003ctd\u003e4026.495938/9.067358\u003c/td\u003e\u003ctd\u003e2591.212157/13.730570\u003c/td\u003e\u003ctd\u003e3963.789278/11.139896\u003c/td\u003e\u003ctd\u003e4835.168698/9.844560\u003c/td\u003e\u003ctd\u003e3738.018788/5.958549\u003c/td\u003e\u003ctd\u003e3472.599548/9.067358\u003c/td\u003e\u003ctd\u003e2846.824328/9.067358\u003c/td\u003e\u003ctd\u003e3964.442923/6.217617\u003c/td\u003e\u003ctd\u003e8248.174848/4.663212\u003c/td\u003e\u003ctd\u003e3178.776910/9.326425\u003c/td\u003e\u003ctd\u003e4521.187784/6.476684\u003c/td\u003e\u003ctd\u003e6.392693/63.730570\u003c/td\u003e\u003ctd\u003e4535.673748/6.476684\u003c/td\u003e\u003ctd\u003e2285.708359/13.730570\u003c/td\u003e\u003ctd\u003e5222.426332/5.699482\u003c/td\u003e\u003ctd\u003e4409.982716/5.440415\u003c/td\u003e\u003ctd\u003e\u003cb\u003e2124.534904/10.362694\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e4863.209844/10.362694\u003c/td\u003e\u003ctd\u003e4875.780156/3.886010\u003c/td\u003e\u003ctd\u003e4278.744225/12.176166\u003c/td\u003e\u003ctd\u003e4661.710772/9.067358\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eMofa\u003c/th\u003e\u003ctd\u003e5555.267163/7.772021\u003c/td\u003e\u003ctd\u003e5328.793555/11.658031\u003c/td\u003e\u003ctd\u003e6064.913246/13.730570\u003c/td\u003e\u003ctd\u003e8844.481560/5.181347\u003c/td\u003e\u003ctd\u003e14355.051790/6.217617\u003c/td\u003e\u003ctd\u003e10773.098216/8.290155\u003c/td\u003e\u003ctd\u003e5702.554716/11.398964\u003c/td\u003e\u003ctd\u003e11819.967712/5.958549\u003c/td\u003e\u003ctd\u003e5810.652609/12.435233\u003c/td\u003e\u003ctd\u003e10899.166334/6.476684\u003c/td\u003e\u003ctd\u003e9606.038800/5.699482\u003c/td\u003e\u003ctd\u003e\u003cb\u003e4528.077873/13.471503\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e10261.988658/9.844560\u003c/td\u003e\u003ctd\u003e38.718690/38.341969\u003c/td\u003e\u003ctd\u003e7191.371927/8.290155\u003c/td\u003e\u003ctd\u003e4847.594375/14.248705\u003c/td\u003e\u003ctd\u003e8110.295270/9.844560\u003c/td\u003e\u003ctd\u003e14375.814958/5.699482\u003c/td\u003e\u003ctd\u003e10070.806870/3.626943\u003c/td\u003e\u003ctd\u003e10826.318474/8.290155\u003c/td\u003e\u003ctd\u003e10187.374717/7.772021\u003c/td\u003e\u003ctd\u003e16953.170797/3.626943\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eMofu\u003c/th\u003e\u003ctd\u003e2175.168540/11.658031\u003c/td\u003e\u003ctd\u003e3005.393159/10.621762\u003c/td\u003e\u003ctd\u003e2773.793897/7.253886\u003c/td\u003e\u003ctd\u003e2257.313709/6.476684\u003c/td\u003e\u003ctd\u003e1807.203325/13.471503\u003c/td\u003e\u003ctd\u003e2481.194623/2.331606\u003c/td\u003e\u003ctd\u003e1626.688315/12.435233\u003c/td\u003e\u003ctd\u003e1473.207901/13.212435\u003c/td\u003e\u003ctd\u003e3206.638463/8.290155\u003c/td\u003e\u003ctd\u003e\u003cb\u003e1358.112972/12.435233\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e2550.513183/10.880829\u003c/td\u003e\u003ctd\u003e1867.275865/12.694301\u003c/td\u003e\u003ctd\u003e2847.897967/4.145078\u003c/td\u003e\u003ctd\u003e1645.699003/13.471503\u003c/td\u003e\u003ctd\u003e50.399227/32.642487\u003c/td\u003e\u003ctd\u003e3831.820284/3.108808\u003c/td\u003e\u003ctd\u003e1679.421861/9.844560\u003c/td\u003e\u003ctd\u003e1957.944241/13.989637\u003c/td\u003e\u003ctd\u003e1655.398024/13.212435\u003c/td\u003e\u003ctd\u003e3439.753108/6.735751\u003c/td\u003e\u003ctd\u003e4164.392749/9.844560\u003c/td\u003e\u003ctd\u003e2176.478824/10.103627\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eDu_n\u003c/th\u003e\u003ctd\u003e3358.977688/12.694301\u003c/td\u003e\u003ctd\u003e8269.025689/5.958549\u003c/td\u003e\u003ctd\u003e6784.926221/4.922280\u003c/td\u003e\u003ctd\u003e4034.987828/10.362694\u003c/td\u003e\u003ctd\u003e8317.977821/5.440415\u003c/td\u003e\u003ctd\u003e4469.988388/9.326425\u003c/td\u003e\u003ctd\u003e4581.242219/9.585492\u003c/td\u003e\u003ctd\u003e4046.289387/10.880829\u003c/td\u003e\u003ctd\u003e4587.843666/10.880829\u003c/td\u003e\u003ctd\u003e4061.430238/12.435233\u003c/td\u003e\u003ctd\u003e4116.231632/8.031088\u003c/td\u003e\u003ctd\u003e4043.687467/11.658031\u003c/td\u003e\u003ctd\u003e8587.884922/5.699482\u003c/td\u003e\u003ctd\u003e\u003cb\u003e2518.760103/13.989637\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e9252.838415/6.217617\u003c/td\u003e\u003ctd\u003e38.646292/34.196891\u003c/td\u003e\u003ctd\u003e2823.000209/11.658031\u003c/td\u003e\u003ctd\u003e7688.259347/5.699482\u003c/td\u003e\u003ctd\u003e4184.395191/9.844560\u003c/td\u003e\u003ctd\u003e6460.323149/9.844560\u003c/td\u003e\u003ctd\u003e12418.880207/5.699482\u003c/td\u003e\u003ctd\u003e4394.753911/10.362694\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eEjag\u003c/th\u003e\u003ctd\u003e878.221181/8.290155\u003c/td\u003e\u003ctd\u003e2977.854246/10.362694\u003c/td\u003e\u003ctd\u003e1122.454274/13.212435\u003c/td\u003e\u003ctd\u003e4066.806240/3.626943\u003c/td\u003e\u003ctd\u003e4401.408293/12.694301\u003c/td\u003e\u003ctd\u003e1324.839235/11.139896\u003c/td\u003e\u003ctd\u003e2760.972117/9.585492\u003c/td\u003e\u003ctd\u003e802.718089/8.808290\u003c/td\u003e\u003ctd\u003e1935.328428/6.735751\u003c/td\u003e\u003ctd\u003e2456.134064/8.549223\u003c/td\u003e\u003ctd\u003e948.726346/11.658031\u003c/td\u003e\u003ctd\u003e1464.326862/6.994819\u003c/td\u003e\u003ctd\u003e1999.633312/6.476684\u003c/td\u003e\u003ctd\u003e2483.815842/4.663212\u003c/td\u003e\u003ctd\u003e790.752998/11.917098\u003c/td\u003e\u003ctd\u003e1436.471564/10.362694\u003c/td\u003e\u003ctd\u003e27.125567/39.896373\u003c/td\u003e\u003ctd\u003e2701.314483/8.549223\u003c/td\u003e\u003ctd\u003e\u003cb\u003e739.895562/13.989637\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e1119.207373/9.844560\u003c/td\u003e\u003ctd\u003e2061.967307/3.367876\u003c/td\u003e\u003ctd\u003e3116.635849/4.663212\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eFulf\u003c/th\u003e\u003ctd\u003e3122.754082/11.139896\u003c/td\u003e\u003ctd\u003e3172.412810/8.290155\u003c/td\u003e\u003ctd\u003e2632.034499/10.103627\u003c/td\u003e\u003ctd\u003e1803.237123/14.507772\u003c/td\u003e\u003ctd\u003e3015.507576/12.953368\u003c/td\u003e\u003ctd\u003e4697.430105/10.621762\u003c/td\u003e\u003ctd\u003e2221.398811/11.917098\u003c/td\u003e\u003ctd\u003e3338.511704/7.772021\u003c/td\u003e\u003ctd\u003e5857.163684/4.663212\u003c/td\u003e\u003ctd\u003e2631.329961/12.694301\u003c/td\u003e\u003ctd\u003e\u003cb\u003e1756.767457/14.248705\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e3965.216351/8.031088\u003c/td\u003e\u003ctd\u003e2961.580251/10.362694\u003c/td\u003e\u003ctd\u003e1850.532804/14.248705\u003c/td\u003e\u003ctd\u003e2431.677037/8.808290\u003c/td\u003e\u003ctd\u003e2688.040706/8.549223\u003c/td\u003e\u003ctd\u003e6237.846441/3.108808\u003c/td\u003e\u003ctd\u003e9.819160/53.108808\u003c/td\u003e\u003ctd\u003e1794.314668/12.435233\u003c/td\u003e\u003ctd\u003e2633.154009/4.922280\u003c/td\u003e\u003ctd\u003e5899.732260/9.585492\u003c/td\u003e\u003ctd\u003e6035.594459/5.440415\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eGbay\u003c/th\u003e\u003ctd\u003e3537.010215/8.808290\u003c/td\u003e\u003ctd\u003e2213.336729/9.326425\u003c/td\u003e\u003ctd\u003e958.976958/14.766839\u003c/td\u003e\u003ctd\u003e2170.105117/2.849741\u003c/td\u003e\u003ctd\u003e2381.840897/8.549223\u003c/td\u003e\u003ctd\u003e1092.011356/11.398964\u003c/td\u003e\u003ctd\u003e989.079405/15.284974\u003c/td\u003e\u003ctd\u003e2110.708219/12.953368\u003c/td\u003e\u003ctd\u003e1212.493865/13.989637\u003c/td\u003e\u003ctd\u003e1342.159428/12.953368\u003c/td\u003e\u003ctd\u003e\u003cb\u003e784.478130/16.321244\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e1404.757907/15.284974\u003c/td\u003e\u003ctd\u003e1949.759014/13.730570\u003c/td\u003e\u003ctd\u003e1165.979838/12.694301\u003c/td\u003e\u003ctd\u003e1940.255308/5.699482\u003c/td\u003e\u003ctd\u003e1073.951745/13.730570\u003c/td\u003e\u003ctd\u003e2180.263932/7.253886\u003c/td\u003e\u003ctd\u003e2639.229412/8.031088\u003c/td\u003e\u003ctd\u003e4.503568/64.766839\u003c/td\u003e\u003ctd\u003e2711.475687/5.440415\u003c/td\u003e\u003ctd\u003e2879.142805/11.139896\u003c/td\u003e\u003ctd\u003e2777.515280/3.626943\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eMASS\u003c/th\u003e\u003ctd\u003e2052.763675/6.476684\u003c/td\u003e\u003ctd\u003e2123.090411/11.139896\u003c/td\u003e\u003ctd\u003e1150.690864/11.398964\u003c/td\u003e\u003ctd\u003e\u003cb\u003e404.857470/19.170984\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e4114.380214/2.849741\u003c/td\u003e\u003ctd\u003e1177.460159/10.880829\u003c/td\u003e\u003ctd\u003e1553.261634/11.917098\u003c/td\u003e\u003ctd\u003e767.332823/13.212435\u003c/td\u003e\u003ctd\u003e1558.036793/6.217617\u003c/td\u003e\u003ctd\u003e673.483311/13.730570\u003c/td\u003e\u003ctd\u003e1308.799442/6.735751\u003c/td\u003e\u003ctd\u003e2525.700131/5.440415\u003c/td\u003e\u003ctd\u003e1157.282835/14.248705\u003c/td\u003e\u003ctd\u003e1665.795367/8.031088\u003c/td\u003e\u003ctd\u003e969.622799/11.139896\u003c/td\u003e\u003ctd\u003e2236.251124/10.621762\u003c/td\u003e\u003ctd\u003e1768.310288/9.585492\u003c/td\u003e\u003ctd\u003e1530.460913/10.621762\u003c/td\u003e\u003ctd\u003e703.513823/14.766839\u003c/td\u003e\u003ctd\u003e9.311520/52.072539\u003c/td\u003e\u003ctd\u003e3781.478640/5.440415\u003c/td\u003e\u003ctd\u003e783.170102/16.580311\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eTupu\u003c/th\u003e\u003ctd\u003e499.010245/24.611399\u003c/td\u003e\u003ctd\u003e2789.182977/9.844560\u003c/td\u003e\u003ctd\u003e1176.557896/16.062176\u003c/td\u003e\u003ctd\u003e335.366353/21.243523\u003c/td\u003e\u003ctd\u003e3759.854817/4.922280\u003c/td\u003e\u003ctd\u003e1473.248900/8.290155\u003c/td\u003e\u003ctd\u003e1637.969909/15.284974\u003c/td\u003e\u003ctd\u003e444.487258/23.056995\u003c/td\u003e\u003ctd\u003e729.184899/19.430052\u003c/td\u003e\u003ctd\u003e326.348924/24.611399\u003c/td\u003e\u003ctd\u003e530.140976/24.611399\u003c/td\u003e\u003ctd\u003e834.757176/20.207254\u003c/td\u003e\u003ctd\u003e1014.747872/11.398964\u003c/td\u003e\u003ctd\u003e1361.103340/11.398964\u003c/td\u003e\u003ctd\u003e447.754239/17.875648\u003c/td\u003e\u003ctd\u003e1313.622745/15.803109\u003c/td\u003e\u003ctd\u003e2020.767969/9.326425\u003c/td\u003e\u003ctd\u003e1234.031067/13.730570\u003c/td\u003e\u003ctd\u003e\u003cb\u003e242.696296/29.533679\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e1209.709716/14.766839\u003c/td\u003e\u003ctd\u003e5.328121/62.953368\u003c/td\u003e\u003ctd\u003e678.820813/13.730570\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth 
scope='row'\u003eVute\u003c/th\u003e\u003ctd\u003e5247.001730/8.290155\u003c/td\u003e\u003ctd\u003e2972.688386/11.398964\u003c/td\u003e\u003ctd\u003e3141.040872/9.067358\u003c/td\u003e\u003ctd\u003e4304.014532/12.435233\u003c/td\u003e\u003ctd\u003e2981.350915/10.880829\u003c/td\u003e\u003ctd\u003e7944.078280/2.331606\u003c/td\u003e\u003ctd\u003e3013.186151/13.730570\u003c/td\u003e\u003ctd\u003e2532.120943/12.176166\u003c/td\u003e\u003ctd\u003e4688.069751/9.844560\u003c/td\u003e\u003ctd\u003e8022.399859/3.886010\u003c/td\u003e\u003ctd\u003e5315.095277/3.626943\u003c/td\u003e\u003ctd\u003e\u003cb\u003e2075.166168/12.694301\u003c/b\u003e\u003c/td\u003e\u003ctd\u003e3794.597938/12.176166\u003c/td\u003e\u003ctd\u003e2879.870276/13.212435\u003c/td\u003e\u003ctd\u003e4364.837110/3.367876\u003c/td\u003e\u003ctd\u003e3858.872867/8.549223\u003c/td\u003e\u003ctd\u003e2749.070864/10.880829\u003c/td\u003e\u003ctd\u003e9917.265191/3.367876\u003c/td\u003e\u003ctd\u003e8091.176547/3.108808\u003c/td\u003e\u003ctd\u003e5939.386425/4.404145\u003c/td\u003e\u003ctd\u003e7670.501815/2.849741\u003c/td\u003e\u003ctd\u003e43.658700/33.419689\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\n\n###### Prerequisite\nIf you want to evaluate the LM on a language `lang`, you must first have a file named `lang.txt` in the `$src_path` directory of [eval_data.sh](eval_data.sh).  \nFor example, if you want to use the biblical corpus, you can run [scripts/bible.py](scripts/bible.py):\n```\n# folder containing the csvs folder\ncsv_path=\n# folder in which the objective folders will be created (mono or para)\noutput_dir=\n# monolingual one (\"mono\") or parallel one (\"para\")\ndata_type=mono\n# list of languages to be considered in alphabetical order and separated by a comma\n# case of one language\nlanguages=lang,lang  \n# case of many languages\nlanguages=lang1,lang2,...   
\n# old_only: use only the Old Testament\n# new_only: use only the New Testament\nnew_only=True\n\npython ../scripts/bible.py --csv_path $csv_path --output_dir $output_dir --data_type $data_type --languages $languages --new_only $new_only\n```\nSee other parameters in [scripts/bible.py](scripts/bible.py)\n\n###### Data pre-processing\nModify parameters in [eval_data.sh](eval_data.sh)\n```\n# languages to be evaluated\nlanguages=lang1,lang2,... \nchmod +x ../eval_data.sh \n../eval_data.sh $languages\n```\n\n###### Evaluation \n\nTake the language to evaluate (say `Bulu`) and rename its file `test.Bulu.pth` (which was created with the `VOCAB` and `CODE` of `Bafi`, the evaluating language) to `test.Bafi.pth`. This is needed because `Bafi` is the evaluating language and the `train.py` script requires the dataset file names to contain the names given in `lgs`. Then simply run the evaluation: the results obtained (accuracy and perplexity) are those of the `Bafi` LM on the `Bulu` language.\n\n```\n# evaluating language\ntgt_pair=\n# folder containing the data to be evaluated (must match $tgt_path in eval_data.sh)\nsrc_path=\n# You have to change two parameters in the configuration file used to train the evaluating LM (\"data_path\":\"$src_path\" and \"eval_only\": \"True\")\n# You must also specify the \"reload_model\" parameter, otherwise the last checkpoint found will be loaded for evaluation.\nconfig_file=../configs/lm_template.json \n# languages to be evaluated\neval_lang= \nchmod +x ../scripts/evaluate.sh\n../scripts/evaluate.sh $eval_lang\n```\nWhen the evaluation is finished, you will see a file named `eval.log` in the `$dump_path/$exp_name/$exp_id` folder containing the evaluation results.    \n**Note**: The description given above is only valid when the evaluating LM has been trained on a single language (and therefore without TLM). Now consider the case where the base LM has been trained on `en-fr` and we want to evaluate it on `de` or `de-ru`. 
`$tgt_pair` becomes `en-fr`, but `language` varies depending on whether the evaluation is done on one language or two:  \n- In the case of `de`: `lang=de-de`  \n- In the case of `de-ru`: `lang=de-ru`.\n\n## IV. References\n\nPlease cite [[1]](https://openreview.net/forum?id=Q5ZxoD2LqcI) and/or [[2]](https://arxiv.org/abs/1901.07291) and/or [[3]](https://arxiv.org/abs/1703.03400) if you find the resources in this repository useful.\n\n### On the use of linguistic similarities to improve Neural Machine Translation for African Languages\n\n[1] Tikeng Notsawo Pascal, NANDA ASSOBJIO Brice Yvan and James Assiene\n```\n@misc{\npascal2021on,\ntitle={On the use of linguistic similarities to improve Neural Machine Translation for African Languages},\nauthor={Tikeng Notsawo Pascal and NANDA ASSOBJIO Brice Yvan and James Assiene},\nyear={2021},\nurl={https://openreview.net/forum?id=Q5ZxoD2LqcI}\n}\n```\n\n### Cross-lingual Language Model Pretraining\n\n[2] G. Lample *, A. Conneau * [*Cross-lingual Language Model Pretraining*](https://arxiv.org/abs/1901.07291) and [facebookresearch/XLM](https://github.com/facebookresearch/XLM)\n\n\\* Equal contribution. 
Order has been determined with a coin flip.\n\n```\n@article{lample2019cross,\n  title={Cross-lingual Language Model Pretraining},\n  author={Lample, Guillaume and Conneau, Alexis},\n  journal={Advances in Neural Information Processing Systems (NeurIPS)},\n  year={2019}\n}\n```\n\n### Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks\n\n[3] Chelsea Finn, Pieter Abbeel, Sergey Levine [*Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks*](https://arxiv.org/abs/1703.03400) and [cbfinn/maml](https://github.com/cbfinn/maml)\n\n```\n@article{finn2017model,\n  title={Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks},\n  author={Finn, Chelsea and Abbeel, Pieter and Levine, Sergey},\n  journal={Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017},\n  year={2017}\n}\n```\n\n## License\n\nSee the [LICENSE](LICENSE) file for more details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftikquuss%2Fmeta_xlm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftikquuss%2Fmeta_xlm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftikquuss%2Fmeta_xlm/lists"}