https://github.com/thesofakillers/claficle

Official repository for the paper "CLAfICLe: Cross-Lingual Adaptation for In-Context Learning". Not Published.
adapters gpt2 multilingual-nlp nlp python transformers

Host: GitHub
URL: https://github.com/thesofakillers/claficle
Owner: thesofakillers
License: mit
Created: 2022-05-12T09:15:44.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2023-06-28T10:30:51.000Z (about 3 years ago)
Last Synced: 2025-10-19T21:45:06.614Z (9 months ago)
Topics: adapters, gpt2, multilingual-nlp, nlp, python, transformers
Language: TeX
Homepage:
Size: 13.9 MB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# CLAfICLe

Read our paper:

[Cross-Lingual Adaptation for In-Context Learning [PDF]](./reports/report/main.pdf) (Not submitted for publication)

**Contents**

* [Requirements and Setup](#requirements-and-setup)

    * [Required Packages](#required-packages)

    * [Checkpoints](#checkpoints)

* [Model Reference](#model-reference)

* [Usage](#usage)

* [Project Organization](#project-organization)

## Requirements and Setup

### Required Packages

Details such as Python and package versions can be found in the generated
[pyproject.toml](pyproject.toml) and [poetry.lock](poetry.lock) files.

We recommend using an environment manager such as
[conda](https://docs.conda.io/en/latest/). After setting up your environment
with the correct Python version, please proceed with the installation of the
required packages.

For [poetry](https://python-poetry.org/) users, getting set up is as easy as
running

```terminal
poetry install
```

We also provide a [requirements.txt](requirements.txt) file for
[pip](https://pypi.org/project/pip/) users who do not wish to use poetry. In
this case, simply run

```terminal
pip install -r requirements.txt
```

This `requirements.txt` file is generated by running the following:

```terminal
sh gen_pip_reqs.sh
```

### Checkpoints

If you wish to run evaluation without first training the models, we provide our
checkpoints via [The Internet Archive](https://archive.org/) at
[this link](https://archive.org/download/claficle/checkpoints.zip). Please unzip
this archive and organize its contents so that the checkpoints are in the
`checkpoints` folder at the root of this repository.

We do not provide the bare `hr_to_lr` MetaICL model checkpoint. For this
checkpoint, please refer to the instructions on the
[MetaICL repo](https://github.com/facebookresearch/MetaICL) for downloading
their `metaicl` model in the `hr_to_lr` setting. Once downloaded, rename the
file to `metaicl.pt` and place it in the relevant checkpoints directory.
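
The download-and-unzip step above can be scripted. The following is a minimal
sketch, not part of this repository: the function names are hypothetical, and
since the internal layout of `checkpoints.zip` is not documented here, the code
assumes that a leading `checkpoints/` folder inside the archive (if there is
one) should be dropped so that files land directly under `checkpoints/`.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath
from urllib.request import urlretrieve

# URL taken from the Checkpoints section of this README.
CHECKPOINTS_URL = "https://archive.org/download/claficle/checkpoints.zip"


def download_checkpoints(zip_path: Path) -> Path:
    """Download the archive to zip_path unless a copy is already there."""
    if not zip_path.exists():
        urlretrieve(CHECKPOINTS_URL, str(zip_path))
    return zip_path


def unpack_checkpoints(zip_path: Path, repo_root: Path) -> Path:
    """Extract every file of the archive into <repo_root>/checkpoints."""
    target = repo_root / "checkpoints"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue
            parts = PurePosixPath(member.filename).parts
            # Skip absolute or parent-escaping paths instead of writing them.
            if not parts or parts[0] == "/" or ".." in parts:
                continue
            # Assumption: drop a leading "checkpoints/" folder if present.
            if parts[0] == "checkpoints":
                parts = parts[1:]
            if not parts:
                continue
            destination = target.joinpath(*parts)
            destination.parent.mkdir(parents=True, exist_ok=True)
            # Stream the file to disk; checkpoints can be large.
            with archive.open(member) as src, open(destination, "wb") as dst:
                shutil.copyfileobj(src, dst)
    return target


# Example usage from the repository root (performs a network download):
#   archive = download_checkpoints(Path("checkpoints.zip"))
#   unpack_checkpoints(archive, Path("."))
```

After unpacking, inspect `checkpoints/` and adjust the layout by hand if the
scripts expect a different structure.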

## Model Reference

The following table provides a reference for the models evaluated in our paper.

| **Model Name** | **Evaluation Languages** | **Description** |
| ------------------------------- | ------------------------ | --------------- |
| `metaicl` | en | `direct` `hr_to_lr` checkpoint from the [MetaICL repo](https://github.com/facebookresearch/MetaICL) |
| `sandwich-{lang}` | fr, de | `metaicl` sandwiched in a translation API for `lang`, serving as a baseline |
| `metaicl-gewechselt-{lang}-clm` | fr, de | `metaicl` adapted to `lang` (fr or de) using [WECHSEL](https://github.com/CPJKU/wechsel), either zero-shot or with the additional recommended CLM training. |
| `gpt2-gewechselt-{lang}-clm` | not evaluated | `gpt2` adapted to `lang` (fr or de) using [WECHSEL](https://github.com/CPJKU/wechsel) with the additional recommended CLM training. Note that we do not evaluate this model; we only use it as a base. |
| `{base}-metaicla` | fr, de | A `base` (any of the `gpt2-gewechselt-{lang}-clm`) with a MetaICL adapter, trained the standard way. |
| `{base}-metaiclva` | fr, de | A `base` (any of the `gpt2-gewechselt-{lang}-clm`) with a MetaICL _vessel_ adapter, trained with targeted distillation. |
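
To make the naming scheme concrete, the sketch below expands the `{lang}` and
`{base}` placeholders from the table into full model names. The helper
`model_names` is hypothetical and only illustrates the pattern; whether the
scripts in this repository accept exactly these strings is an assumption.

```python
from typing import List, Sequence

# Target languages listed in the table above.
LANGS = ("fr", "de")


def model_names(langs: Sequence[str] = LANGS) -> List[str]:
    """Expand the name templates from the model reference table."""
    names = ["metaicl"]  # the English-only MetaICL checkpoint
    for lang in langs:
        base = f"gpt2-gewechselt-{lang}-clm"  # used as a base, not evaluated
        names += [
            f"sandwich-{lang}",
            f"metaicl-gewechselt-{lang}-clm",
            base,
            f"{base}-metaicla",
            f"{base}-metaiclva",
        ]
    return names


for name in model_names():
    print(name)
```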

## Usage

We use [hydra](https://hydra.cc/) for configuring our project.

To download and process the data, run either
[claficle/data/oscar.py](claficle/data/oscar.py) or
[claficle/data/benchmark.py](claficle/data/benchmark.py), for OSCAR and our
multi-lingual multi-task benchmark, respectively. You may have to configure or
override [claficle/conf/setup_data.yaml](claficle/conf/setup_data.yaml)
accordingly. We suggest inspecting [slurm/data/](slurm/data/) for examples of
how we ran these.
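
Hydra accepts configuration overrides on the command line as `key=value`
arguments. As a rough sketch of what overriding the data configuration might
look like, the hypothetical helper below builds such a command. The override
key `lang` is a placeholder chosen for illustration; the real keys are whatever
`claficle/conf/setup_data.yaml` defines.

```python
import sys
from typing import List, Mapping


def hydra_command(script: str, overrides: Mapping[str, object]) -> List[str]:
    """Build a command line that passes Hydra-style key=value overrides."""
    args = [f"{key}={value}" for key, value in overrides.items()]
    return [sys.executable, script, *args]


# "lang" is a hypothetical config key; check setup_data.yaml for real ones.
command = hydra_command("claficle/data/oscar.py", {"lang": "fr"})
print(" ".join(command))

# To actually launch the script from the repository root:
#   import subprocess
#   subprocess.run(command, check=True)
```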

Note that to process the French and German OSCAR data you need the tokenizers
trained during WECHSEL initialization. You can either download these along with
our checkpoints or run WECHSEL initialization yourself by running
[claficle/models/gewechselt.py](claficle/models/gewechselt.py), configured with
[claficle/conf/wechsel_init.yaml](claficle/conf/wechsel_init.yaml). We have
examples of how we ran this in [slurm/wechsel_init/](slurm/wechsel_init/).

Once the data is downloaded, evaluate by running
[claficle/run/eval.py](claficle/run/eval.py), configured with
[claficle/conf/eval.yaml](claficle/conf/eval.yaml). Examples are in
[slurm/eval/](slurm/eval/).

Of course, to run evaluation you need trained checkpoints. Once again, you can
either download these or train them yourself. For geWECHSELt models, you can
run [claficle/run/train.py](claficle/run/train.py). For MetaICLVA, you can run
[claficle/run/distil.py](claficle/run/distil.py). For MetaICLA, please refer to
[our MetaICL fork](https://github.com/thesofakillers/metaICLA). As always,
these are configured with the relevant files in
[claficle/conf/](claficle/conf/) and are accompanied by examples of how we ran
them in [slurm/](slurm/).
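
Before launching jobs, it can help to confirm that the scripts and config files
named in this section are present in your checkout. The sketch below is not
part of the repository: the stage labels and the `missing_entry_points` helper
are hypothetical, and only paths explicitly mentioned above are listed.

```python
from pathlib import Path
from typing import Dict, List

# Paths as named in the Usage section; the stage labels are illustrative only.
ENTRY_POINTS: Dict[str, List[str]] = {
    "data": [
        "claficle/data/oscar.py",
        "claficle/data/benchmark.py",
        "claficle/conf/setup_data.yaml",
    ],
    "wechsel_init": [
        "claficle/models/gewechselt.py",
        "claficle/conf/wechsel_init.yaml",
    ],
    "eval": ["claficle/run/eval.py", "claficle/conf/eval.yaml"],
    "train": ["claficle/run/train.py"],
    "distil": ["claficle/run/distil.py"],
}


def missing_entry_points(repo_root: Path) -> Dict[str, List[str]]:
    """Return, per stage, the listed files that do not exist under repo_root."""
    missing: Dict[str, List[str]] = {}
    for stage, paths in ENTRY_POINTS.items():
        absent = [p for p in paths if not (repo_root / p).is_file()]
        if absent:
            missing[stage] = absent
    return missing


# Example usage from the repository root:
#   for stage, paths in missing_entry_points(Path(".")).items():
#       print(stage, "is missing", paths)
```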

## Project Organization

```plaintext
    ├── LICENSE
    ├── README.md          <- The top-level README
    ├── data/
    │   ├── interim/       <- Intermediate data that has been transformed.
    │   ├── processed/     <- The final, canonical data sets for modeling.
    │   └── raw/           <- The original, immutable data dump.
    ├── checkpoints/       <- Trained and serialized models.
    ├── notebooks/         <- Jupyter notebooks.
    ├── slurm/             <- SLURM scripts
    ├── logs/              <- logs
    ├── reports/           <- Generated analysis as HTML, PDF, LaTeX, etc.
    ├── pyproject.toml     <- project metadata, handled by poetry.
    ├── poetry.lock        <- resolving and locking dependencies, handled by poetry.
    ├── requirements.txt   <- for non-poetry users.
    ├── gen_pip_reqs.sh    <- for generating the pip requirements.txt file
    └── claficle/          <- Source code for use in this project.
        ├── __init__.py    <- Makes claficle a Python module
        ├── data/          <- Scripts to download or generate data
        ├── models/        <- Model definitions
        ├── run/           <- scripts to train, evaluate and use models
        ├── conf/          <- config files
        ├── utils/         <- miscellaneous utils
        └── visualization/ <- Scripts for visualization
```

The project structure is largely based on the
[cookiecutter data-science template](https://github.com/drivendata/cookiecutter-data-science).
This is purposely opinionated so that paths align across collaborators without
having to edit config files. Users may find the
[cookiecutter data-science opinions page](http://drivendata.github.io/cookiecutter-data-science/#opinions)
relevant.

The top-level `data/` and `models/` directories are in version control only to
show structure. Their contents will not be committed and are ignored via
`.gitignore`.
