https://github.com/thudm/efficient-head-finetuning

Source code for EMNLP2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head

Host: GitHub
URL: https://github.com/thudm/efficient-head-finetuning
Owner: THUDM
License: apache-2.0
Created: 2022-10-21T12:52:24.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-11-07T05:54:42.000Z (over 3 years ago)
Last Synced: 2025-03-24T13:11:16.240Z (about 1 year ago)
Topics: finetuning, language-model
Language: Python
Homepage:
Size: 2.1 MB
Stars: 14
Watchers: 13
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Parameter-Efficient-Tuning-Makes-a-Good-Classification-Head

Source code for EMNLP2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head

[arxiv](https://arxiv.org/abs/2210.16771)

We found that the following two-step procedure usually works better than direct finetuning:

> 1. Finetune the pretrained LM with a parameter-efficient algorithm.

> 2. Finetune the pretrained LM, initializing the classification head with the weights from step 1.
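The two steps can be read as a choice of which parameters are trainable in each stage. The snippet below is a minimal sketch, not the repository's implementation: the parameter names, the `classifier.` prefix, and the helper `trainable_params` are hypothetical, and BitFit (training only bias terms plus the head) is assumed as the parameter-efficient algorithm.

```python
# Hypothetical parameter names for a small encoder with a classification head.
PARAM_NAMES = [
    "encoder.layer0.attention.weight",
    "encoder.layer0.attention.bias",
    "encoder.layer0.mlp.weight",
    "encoder.layer0.mlp.bias",
    "classifier.weight",
    "classifier.bias",
]


def trainable_params(names, stage):
    """Return the parameter names that are updated in the given stage.

    Stage 1 (assumed BitFit-style): only bias terms and the head are trained.
    Stage 2: every parameter of the pretrained LM is trained, with the head
    initialized from the stage 1 result instead of a random initialization.
    """
    if stage == 1:
        return [n for n in names
                if n.startswith("classifier.") or n.endswith(".bias")]
    if stage == 2:
        return list(names)
    raise ValueError(f"unknown stage: {stage}")


stage1 = trainable_params(PARAM_NAMES, stage=1)
stage2 = trainable_params(PARAM_NAMES, stage=2)
print(stage1)       # bias terms plus the classification head
print(len(stage2))  # all parameters
```

In this sketch only the head is carried over from stage 1 to stage 2; the backbone in stage 2 starts from the pretrained checkpoint, as described in step 2.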

**We implement our methods based on the open-source library [SwissArmyTransformer](https://github.com/THUDM/SwissArmyTransformer).**

**Step 1.** 

Download the checkpoint of [RoBERTa-Large](https://cloud.tsinghua.edu.cn/f/66c42c24ca304cecaf7e/?dl=1) or [BERT-Large](https://cloud.tsinghua.edu.cn/f/6d4f38c96e8c4c16917e/?dl=1) (provided by SwissArmyTransformer) and decompress it.

**Step 2.**

Add the checkpoint directory path to line 5 of `EH-FT/roberta/scripts/finetune.sh`.

**Step 3.**

```
cd EH-FT/roberta
python scripts/run_multiseed.py --number-gpu 1 --gpu-s 0 --seed-per-gpu 1 --dataset rte --finetune-type 2step+bitfit
```


The script launches `number-gpu` processes on GPUs `gpu-s`, `gpu-s + 1`, ..., `gpu-s + number-gpu - 1`. Each process uses a different random seed.
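The GPU assignment described above can be written out as a small calculation. This is a sketch of the stated rule only; `gpu_ids` is a hypothetical helper and is not taken from `run_multiseed.py`.

```python
def gpu_ids(number_gpu, gpu_s):
    """GPU indices used by the launched processes.

    One process per GPU, starting at gpu_s and ending at
    gpu_s + number_gpu - 1 (inclusive).
    """
    return [gpu_s + i for i in range(number_gpu)]


print(gpu_ids(number_gpu=1, gpu_s=0))  # [0]
print(gpu_ids(number_gpu=4, gpu_s=2))  # [2, 3, 4, 5]
```

For example, `--number-gpu 4 --gpu-s 2` would occupy GPUs 2 through 5 under this rule.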

**You can change the dataset and finetune-type.**

Dataset: rte, mrpc, boolq, wic, cb, copa, wsc, qnli, stsb

| Finetune-type  | Name in paper             |
| -------------- | ------------------------- |
| all            | Traditional finetuning    |
| 2step+head     | LP-FT                     |
| 2step+bitfit   | EH-FT(BitFit)             |
| 2step+lora     | EH-FT(LoRA)               |
| 2step+pt       | EH-FT(PT)                 |
| bitfit/lora/pt | BitFit/LoRA/Prefix tuning |
| head           | Linear Probing            |
| child          | Child-Tuning              |
| mixout         | Mixout                    |
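To compare several settings, the command line can be generated for each dataset and finetune-type pair. The sketch below only builds the command strings using the flags shown in Step 3; `build_command` is a hypothetical helper, and the commands are assumed to be run from `EH-FT/roberta`.

```python
import shlex

# Datasets listed in this README.
DATASETS = ["rte", "mrpc", "boolq", "wic", "cb", "copa", "wsc", "qnli", "stsb"]


def build_command(dataset, finetune_type, number_gpu=1, gpu_s=0, seed_per_gpu=1):
    """Build the run_multiseed.py command line for one configuration."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    args = [
        "python", "scripts/run_multiseed.py",
        "--number-gpu", str(number_gpu),
        "--gpu-s", str(gpu_s),
        "--seed-per-gpu", str(seed_per_gpu),
        "--dataset", dataset,
        "--finetune-type", finetune_type,
    ]
    # shlex.join quotes arguments where needed so the string is shell-safe.
    return shlex.join(args)


# Print the commands for traditional finetuning and EH-FT(BitFit) on RTE.
for finetune_type in ["all", "2step+bitfit"]:
    print(build_command("rte", finetune_type))
```

Each printed line can be pasted into a shell, or passed to `subprocess.run(shlex.split(cmd), check=True)` to launch the runs one after another.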

**Step 4.**

View the results in `runs/` using TensorBoard.
