https://github.com/om-ai-lab/vl-checklist
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
- Host: GitHub
- URL: https://github.com/om-ai-lab/vl-checklist
- Owner: om-ai-lab
- Created: 2022-07-03T08:34:42.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-09-29T07:50:40.000Z (about 1 year ago)
- Last Synced: 2025-03-29T12:09:25.744Z (6 months ago)
- Topics: evaluation-metrics, multimodal-deep-learning, vision-and-language
- Language: Python
- Homepage:
- Size: 26.6 MB
- Stars: 126
- Watchers: 5
- Forks: 5
- Open Issues: 8
- Metadata Files:
  - Readme: README.md
README
# VL-CheckList
## Updates
- 07/04/2022: VL-CheckList paper on arxiv https://arxiv.org/abs/2207.00221
- 07/12/2022: Updated object, relation, attribute splits/dataset
- 08/01/2022: Released the initial code and example models

## Introduction
This repository is the official project page for VL-CheckList.

**VL-CheckList** is an explainable framework that comprehensively evaluates VLP models and facilitates a deeper understanding of them. The current practice of evaluating a VLP model solely by its fine-tuned downstream-task performance has a number of limitations, such as poor interpretability, incomparable results, and data bias. The core principles of VL-CheckList are: (1) evaluate a VLP model's fundamental capabilities instead of its performance on downstream applications, and (2) disentangle those capabilities into relatively independent variables that are easier to analyze.

VL-CheckList evaluates VLP models from three aspects: Object, Attribute, and Relationship. We provide a quantitative performance table and a radar chart over these three aspects.
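The radar charts summarize one score per aspect. As an illustration only (not part of the toolkit), such a chart can be drawn with a few lines of matplotlib; the scores below are placeholder numbers, not results from the paper:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder scores for one model on the three aspects (not real results).
aspects = ["Object", "Attribute", "Relationship"]
scores = [0.82, 0.67, 0.54]

# Close the polygon by repeating the first point.
angles = np.linspace(0, 2 * np.pi, len(aspects), endpoint=False).tolist()
angles += angles[:1]
values = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(angles, values, marker="o")
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(aspects)
ax.set_ylim(0, 1)
ax.set_title("VL-CheckList aspect scores (placeholder numbers)")
fig.savefig("radar.png")
```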
## How to Install VL-CheckList
You can install `vl_checklist`, import it in your project, and evaluate your models:
```
pip install vl_checklist
```
`vilt_test.py` is an example script that shows how to import `vl_checklist` in your project. You need to copy the `data/` and `corpus/` folders to the root of your project and prepare the image datasets ([Link](DATASETS.md)).
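Before running an evaluation, a quick sanity check (illustrative only, not part of `vl_checklist`) can confirm the copied folders are in place:

```python
from pathlib import Path

# Illustrative sanity check: vl_checklist expects these folders in the project root.
for folder in ("data", "corpus"):
    if not Path(folder).is_dir():
        print(f"Missing '{folder}/' - copy it from the VL-CheckList repository")
```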
You can also clone this project and add your model as follows.
```
git clone https://github.com/om-ai-lab/VL-CheckList.git
```

## Detailed Guidelines: How to Evaluate Your Model
We include several representative example VLP models in the `example_models/` folder.

**1.** Define a config file, e.g. [configs/sample.yaml](./configs/sample.yaml):
```
MAX_NUM: 2000
MODEL_NAME: "ViLT"
BATCH_SIZE: 4
TASK: "itc"
DATA:
  TYPES: ["Attribute/color"]
  TEST_DATA: ["vg", "vaw"]
OUTPUT:
  DIR: "output/vilt"
```
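To sanity-check a config before a run, here is a minimal sketch (assuming PyYAML is installed; this is not part of the toolkit's API):

```python
import yaml  # requires PyYAML (pip install pyyaml)

# Illustrative only: load the sample config and print the fields that drive a run.
with open("configs/sample.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["MODEL_NAME"], cfg["TASK"], cfg["BATCH_SIZE"])
print("evaluating:", cfg["DATA"]["TYPES"], "on", cfg["DATA"]["TEST_DATA"])
print("results will be written to:", cfg["OUTPUT"]["DIR"])
```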
**2.** Prepare Evaluation Data
We provide the initial curated JSON files at `data/` and the corresponding YAML files at `vl_checklist/corpus`. You need to download the image datasets; detailed instructions are here: [Link](DATASETS.md)

**3.** Load a model class that implements `predict()` and the `Evaluate` class as follows. Please find an example model class here: [Link](./example_models/vilt/engine.py)
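As a rough, hypothetical skeleton of such a model class (the constructor argument mirrors the ViLT example below, but the exact `predict()` signature and return format expected by `Evaluate` should be copied from `example_models/vilt/engine.py`):

```python
import random

class MyVLPModel:
    """Hypothetical skeleton; see example_models/vilt/engine.py for the real interface."""

    def __init__(self, checkpoint_path):
        # In a real wrapper, load your pretrained VLP model from checkpoint_path.
        self.checkpoint_path = checkpoint_path

    def predict(self, images, texts):
        # In a real wrapper, return an image-text matching score per (image, text) pair.
        # Dummy random scores keep this skeleton self-contained and runnable.
        return [random.random() for _ in texts]
```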
**4.** Run `start()` as follows. Here is an example:
```python
from example_models.vilt.engine import ViLT
from vl_checklist.evaluate import Evaluate

if __name__ == '__main__':
    model = ViLT('vilt_200k_mlm_itm.ckpt')
    vilt_eval = Evaluate(config="configs/sample.yaml", model=model)
    vilt_eval.start()
```

**5.** Check the results in the output directory (`OUTPUT.DIR`) defined in your yaml file.
You can check the output format here: [LINK](OUTPUT.md)
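As an illustration only (the authoritative file layout and schema are documented in OUTPUT.md), the JSON results produced by a finished run can be listed and loaded like this:

```python
import json
from pathlib import Path

# Illustrative only: inspect whatever JSON files an evaluation run produced.
out_dir = Path("output/vilt")  # the OUTPUT DIR from configs/sample.yaml
for path in sorted(out_dir.rglob("*.json")):
    with open(path) as f:
        result = json.load(f)
    print(path, "->", type(result).__name__)
```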
## Download Pretrained Weights
We include example models at `example_models/`. To test them, download the pretrained weights into the `resources/` folder:
- [ViLT-B/32](https://github.com/dandelin/ViLT/releases/download/200k/vilt_200k_mlm_itm.ckpt)
- [ALBEF](https://storage.googleapis.com/sfr-pcl-data-research/ALBEF/ALBEF.pth)
- [TCL](https://drive.google.com/file/d/1Cb1azBdcdbm0pRMFs-tupKxILTCXlB4O/view)
- [UNITER](https://github.com/ChenRocks/UNITER)
- [OSCAR](https://biglmdiag.blob.core.windows.net/vinvl/model_ckpts/image_captioning/pretrained_base.zip)
- [LXMERT](https://drive.google.com/drive/folders/1Gq1uLUk6NdD0CcJOptXjxE6ssY5XAuat?usp=sharing)
- [CLIP](https://github.com/openai/CLIP)
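As a convenience sketch (not part of the toolkit), the directly linked checkpoints can be fetched with the standard library; the Google Drive checkpoints (TCL, LXMERT) are easier to download manually in a browser:

```python
import urllib.request
from pathlib import Path

# Illustrative only: fetch the ViLT checkpoint into resources/ using the link above.
url = "https://github.com/dandelin/ViLT/releases/download/200k/vilt_200k_mlm_itm.ckpt"
dest = Path("resources") / "vilt_200k_mlm_itm.ckpt"
dest.parent.mkdir(exist_ok=True)
urllib.request.urlretrieve(url, str(dest))
print("saved to", dest)
```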
## Demo
We present a demo on Hugging Face Spaces; you can try it here: [Demo link](https://huggingface.co/spaces/omlab/VL_checklist_demo)
In this demo, you can change the object and its attributes in the text prompt. You can also change the size and location of the object.

## References
If you use any source code or datasets included in this toolkit in your work, please cite the following papers. The BibTeX entries are listed below:
```
@misc{https://doi.org/10.48550/arxiv.2207.00221,
doi = {10.48550/ARXIV.2207.00221},
url = {https://arxiv.org/abs/2207.00221},
author = {Zhao, Tiancheng and Zhang, Tianqi and Zhu, Mingwei and Shen, Haozhan and Lee, Kyusong and Lu, Xiaopeng and Yin, Jianwei},
keywords = {Computer Vision and Pattern Recognition (cs.CV), Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```

```
@inproceedings{zhao2022explainable,
title={An explainable toolbox for evaluating pre-trained vision-language models},
author={Zhao, Tiancheng and Zhang, Tianqi and Zhu, Mingwei and Shen, Haozhan and Lee, Kyusong and Lu, Xiaopeng and Yin, Jianwei},
booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
pages={30--37},
year={2022}
}
```