https://github.com/hila-chefer/transformer-mm-explainability
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.
- Host: GitHub
- URL: https://github.com/hila-chefer/transformer-mm-explainability
- Owner: hila-chefer
- License: mit
- Created: 2021-03-23T22:11:18.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-08-24T17:45:14.000Z (over 1 year ago)
- Last Synced: 2025-01-22T18:04:57.659Z (10 days ago)
- Topics: clip, detr, explainability, explainable-ai, interpretability, lxmert, transformer, transformers, visualbert, visualization, vqa
- Language: Jupyter Notebook
- Homepage:
- Size: 25.3 MB
- Stars: 819
- Watchers: 8
- Forks: 107
- Open Issues: 12
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
[ICCV 2021 Oral] PyTorch Implementation of `Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers `_
================================================================================================================================================================

|youtube|
.. |youtube| image:: https://img.shields.io/static/v1?label=ICCV2021&message=12MinuteVideo&color=red
   :target: https://www.youtube.com/watch?v=bQTL34Dln-M

Notebooks for LXMERT + DETR:
----------------------------

|DETR_LXMERT|
.. |DETR_LXMERT| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/github/hila-chefer/Transformer-MM-Explainability/blob/main/Transformer_MM_Explainability.ipynb

Notebook for CLIP:
------------------

|CLIP|
.. |CLIP| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/github/hila-chefer/Transformer-MM-Explainability/blob/main/CLIP_explainability.ipynb

**Demo**: You can check out a demo on `Huggingface spaces `_ or scan the following QR code.

.. image:: https://user-images.githubusercontent.com/19412343/176676771-d26f2146-9901-49e7-99be-b030f3d790de.png
   :width: 100

Notebook for ViT:
-----------------

|ViT|
.. |ViT| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/github/hila-chefer/Transformer-MM-Explainability/blob/main/Transformer_MM_explainability_ViT.ipynb

.. sectnum::
Using Colab
-----------

* Please note that the notebooks assume you are using a GPU runtime. To switch the runtime type, go to Runtime -> Change runtime type and select GPU.
* Installing all the requirements may take some time. After installation, please restart the runtime.
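A quick way to confirm that the Colab runtime actually exposes a GPU before running the heavier cells is a short PyTorch check (a minimal sketch; it only assumes `torch` is installed, which the notebooks already require):

.. code-block:: python

   import torch

   # Confirm the runtime is backed by a GPU before running the notebook cells.
   if torch.cuda.is_available():
       print("GPU runtime detected:", torch.cuda.get_device_name(0))
   else:
       print("No GPU detected - switch the runtime type to GPU and restart.")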
Running Examples
----------------

Note that we provide two `jupyter` notebooks to run the examples presented in the paper.
* `The notebook for LXMERT <./LXMERT.ipynb>`_ contains both the examples from the paper and examples with images from the internet and free-form questions.
  To use your own input, simply change the `URL` variable to your image and the `question` variable to your free-form question.

.. image:: LXMERT.PNG
.. image:: LXMERT-web.PNG
* `The notebook for DETR <./DETR.ipynb>`_ contains the examples from the paper.
  To use your own input, simply change the `URL` variable to your image.

.. image:: DETR.PNG
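For example, to try your own input in the LXMERT notebook, the two variables described above are the only ones that need editing (the values below are placeholders, not inputs shipped with the repository):

.. code-block:: python

   # In the LXMERT notebook: point the example at your own input.
   URL = "https://example.com/my_image.jpg"   # placeholder image URL
   question = "What color is the car?"        # placeholder free-form question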
Reproduction of results
-----------------------

^^^^^^^^^^
VisualBERT
^^^^^^^^^^

Run the `run.py` script as follows:
.. code-block:: bash

   CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python VisualBERT/run.py --method=<method_name> --is-text-pert=<true/false> --is-positive-pert=<true/false> --num-samples=10000 config=projects/visual_bert/configs/vqa2/defaults.yaml model=visual_bert dataset=vqa2 run_type=val checkpoint.resume_zoo=visual_bert.finetuned.vqa2.from_coco_train env.data_dir=/path/to/data_dir training.num_workers=0 training.batch_size=1 training.trainer=mmf_pert training.seed=1234
.. note::

   If the datasets aren't already in `env.data_dir`, then the script will download the data automatically to the path in `env.data_dir`.
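The perturbation tests are run separately for the text and image modalities and for positive/negative perturbations, so reproducing all of them means invoking the command above once per combination. The sketch below simply loops over those four settings and shells out to `run.py`; the method name `ours` and the data path are placeholders used for illustration (check `VisualBERT/run.py` for the flag values it actually accepts):

.. code-block:: python

   import os
   import subprocess

   # Placeholders/assumptions - adjust to the values accepted by VisualBERT/run.py.
   method = "ours"
   data_dir = "/path/to/data_dir"

   # Mirror the CUDA_VISIBLE_DEVICES/PYTHONPATH settings of the bash command above.
   env = dict(os.environ, CUDA_VISIBLE_DEVICES="0", PYTHONPATH=os.getcwd())

   for is_text in ("true", "false"):          # text vs. image perturbation
       for is_positive in ("true", "false"):  # positive vs. negative perturbation
           subprocess.run(
               [
                   "python", "VisualBERT/run.py",
                   f"--method={method}",
                   f"--is-text-pert={is_text}",
                   f"--is-positive-pert={is_positive}",
                   "--num-samples=10000",
                   "config=projects/visual_bert/configs/vqa2/defaults.yaml",
                   "model=visual_bert", "dataset=vqa2", "run_type=val",
                   "checkpoint.resume_zoo=visual_bert.finetuned.vqa2.from_coco_train",
                   f"env.data_dir={data_dir}",
                   "training.num_workers=0", "training.batch_size=1",
                   "training.trainer=mmf_pert", "training.seed=1234",
               ],
               check=True,
               env=env,
           )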
^^^^^^
LXMERT
^^^^^^

#. Download `valid.json <https://nlp.cs.unc.edu/data/lxmert_data/vqa/valid.json>`_:

   .. code-block:: bash

      pushd data/vqa
      wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/valid.json
      popd

#. Download the `COCO_val2014` set to your local machine.
   .. note::

      If you already downloaded `COCO_val2014` for the `VisualBERT`_ tests, you can simply use the same path you used for `VisualBERT`_.
#. Run the `perturbation.py` script as follows:

   .. code-block:: bash

      CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python lxmert/lxmert/perturbation.py --COCO_path /path/to/COCO_val2014 --method <method_name> --is-text-pert <true/false> --is-positive-pert <true/false>
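Because the perturbation run is long, it can help to sanity-check the downloaded inputs before launching it. A minimal sketch (the COCO path is a placeholder for whatever you pass to `--COCO_path`; the file pattern follows the standard `COCO_val2014_*.jpg` naming):

.. code-block:: python

   import glob
   import os

   coco_path = "/path/to/COCO_val2014"  # placeholder: same path passed to --COCO_path
   valid_json = "data/vqa/valid.json"   # downloaded in the first step above

   assert os.path.isfile(valid_json), f"missing {valid_json} - rerun the wget step"
   images = glob.glob(os.path.join(coco_path, "COCO_val2014_*.jpg"))
   print(f"found {len(images)} COCO val2014 images")  # 40,504 for the full validation set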
^^^^
DETR
^^^^

#. Download the COCO dataset as described in the `DETR repository `_.
   Note that you only need the validation set.
#. Lower the IoU minimum threshold from 0.5 to 0.2 using the following steps (a shortcut for locating `cocoeval.py` via `pycocotools` is sketched at the end of this section):

   * Locate the `cocoeval.py` script in your Python library path:

     Find the library path:

     .. code-block:: python

        import sys
        print(sys.path)

     Find `cocoeval.py`:

     .. code-block:: bash

        cd /path/to/lib
        find -name cocoeval.py
   * Change the `self.iouThrs` value in the `setDetParams` function (which sets the parameters for the COCO detection evaluation) in the `Params` class as follows:

     Instead of:

     .. code-block:: python

        self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)

     use:

     .. code-block:: python

        self.iouThrs = np.linspace(.2, 0.95, int(np.round((0.95 - .2) / .05)) + 1, endpoint=True)
#. To run the segmentation experiment, use the following command:

   .. code-block:: bash

      CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python DETR/main.py --coco_path /path/to/coco/dataset --eval --masks --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --batch_size 1 --method <method_name>
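As an alternative to walking `sys.path` by hand in step 2, the installed `pycocotools` package can report the location of `cocoeval.py` directly (this assumes `pycocotools` is installed in the active environment, which is where `cocoeval.py` lives for the COCO evaluation):

.. code-block:: python

   # Print the path of the installed cocoeval.py so it can be edited directly.
   from pycocotools import cocoeval

   print(cocoeval.__file__)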
Citing
------

If you make use of our work, please cite our paper:

.. code-block:: latex

   @InProceedings{Chefer_2021_ICCV,
       author    = {Chefer, Hila and Gur, Shir and Wolf, Lior},
       title     = {Generic Attention-Model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers},
       booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
       month     = {October},
       year      = {2021},
       pages     = {397-406}
   }
Credits
-------

* VisualBERT implementation is based on the `MMF `_ framework.
* LXMERT implementation is based on the `official LXMERT `_ implementation and on `Hugging Face Transformers `_.
* DETR implementation is based on the `official DETR `_ implementation.
* CLIP implementation is based on the `official CLIP `_ implementation.
* The CLIP huggingface spaces demo was made by Paul Hilders, Danilo de Goede, and Piyush Bagad from the University of Amsterdam as part of their `final project `_.