Multi Task Vision and Language
https://github.com/facebookresearch/vilbert-multi-task
- Host: GitHub
- URL: https://github.com/facebookresearch/vilbert-multi-task
- Owner: facebookresearch
- License: mit
- Archived: true
- Created: 2019-11-25T22:37:44.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2022-02-16T04:47:38.000Z (almost 3 years ago)
- Last Synced: 2024-08-08T13:13:04.306Z (5 months ago)
- Language: Jupyter Notebook
- Size: 114 MB
- Stars: 800
- Watchers: 19
- Forks: 180
- Open Issues: 68
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- StarryDivineSky - facebookresearch/vilbert-multi-task
README
# 12-in-1: Multi-Task Vision and Language Representation Learning
Please cite the following if you use this code. Code and pre-trained models for [12-in-1: Multi-Task Vision and Language Representation Learning](http://openaccess.thecvf.com/content_CVPR_2020/html/Lu_12-in-1_Multi-Task_Vision_and_Language_Representation_Learning_CVPR_2020_paper.html):
```
@InProceedings{Lu_2020_CVPR,
author = {Lu, Jiasen and Goswami, Vedanuj and Rohrbach, Marcus and Parikh, Devi and Lee, Stefan},
title = {12-in-1: Multi-Task Vision and Language Representation Learning},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
```

and [ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks](https://arxiv.org/abs/1908.02265):
```
@inproceedings{lu2019vilbert,
title={Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks},
author={Lu, Jiasen and Batra, Dhruv and Parikh, Devi and Lee, Stefan},
booktitle={Advances in Neural Information Processing Systems},
pages={13--23},
year={2019}
}
```

## Repository Setup
1. Create a fresh conda environment, and install all dependencies.
```text
conda create -n vilbert-mt python=3.6
conda activate vilbert-mt
git clone --recursive https://github.com/facebookresearch/vilbert-multi-task.git
cd vilbert-multi-task
pip install -r requirements.txt
```

2. Install PyTorch.
```
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
```

3. Install apex, following the instructions at https://github.com/NVIDIA/apex.
4. Install this codebase as a package in this environment.
```text
python setup.py develop
```

## Data Setup
Check `README.md` under `data` for more details.
## Visiolinguistic Pre-training and Multi-Task Training
### Pretraining on Conceptual Captions
```
python train_concap.py --bert_model bert-base-uncased --config_file config/bert_base_6layer_6conect.json --train_batch_size 512 --objective 1 --file_path
```
[Download link](https://dl.fbaipublicfiles.com/vilbert-multi-task/pretrained_model.bin)

### Multi-task Training
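The checkpoint is distributed as a single large `.bin` file, so it is worth verifying that a download completed intact. A small standalone helper like the following can do that by streaming the file through SHA-256; this helper is not part of the repository, and no official digest is published, so compare against a hash you obtained from a source you trust:

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MB chunks so large checkpoints fit in constant memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Usage (the expected digest here is a placeholder, not a published value):
# assert sha256_of("pretrained_model.bin") == "<expected-hex-digest>"
```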
```
python train_tasks.py --bert_model bert-base-uncased --from_pretrained --config_file config/bert_base_6layer_6conect.json --tasks 1-2-4-7-8-9-10-11-12-13-15-17 --lr_scheduler 'warmup_linear' --train_iter_gap 4 --task_specific_tokens --save_name multi_task_model
```

[Download link](https://dl.fbaipublicfiles.com/vilbert-multi-task/multi_task_model.bin)
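The `--tasks` flag above takes a hyphen-separated list of task IDs. A minimal sketch of how such a flag can be parsed into a list of integers; this helper is illustrative only and is not the repository's actual argument handling:

```python
def parse_task_ids(tasks_flag: str) -> list:
    """Split a hyphen-separated task string like '1-2-4' into sorted unique ints.

    Purely illustrative of the --tasks flag's format.
    """
    ids = {int(part) for part in tasks_flag.split("-") if part}
    return sorted(ids)


print(parse_task_ids("1-2-4-7-8-9-10-11-12-13-15-17"))
# -> [1, 2, 4, 7, 8, 9, 10, 11, 12, 13, 15, 17]
```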
### Fine-tune from Multi-task trained model
```
python train_tasks.py --bert_model bert-base-uncased --from_pretrained --config_file config/bert_base_6layer_6conect.json --tasks 1 --lr_scheduler 'warmup_linear' --train_iter_gap 4 --task_specific_tokens --save_name finetune_from_multi_task_model
```
## License

vilbert-multi-task is licensed under the MIT license, available in the [LICENSE](LICENSE) file.