https://github.com/paarthneekhara/multimodal_rerprogramming
Multimodal adversarial reprogramming
- Host: GitHub
- URL: https://github.com/paarthneekhara/multimodal_rerprogramming
- Owner: paarthneekhara
- Created: 2020-10-31T15:51:31.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-10-12T19:25:35.000Z (almost 4 years ago)
- Last Synced: 2025-04-30T09:01:59.651Z (5 months ago)
- Language: Jupyter Notebook
- Size: 1.77 MB
- Stars: 11
- Watchers: 4
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# Cross-modal Adversarial Reprogramming
Code for our WACV 2022 paper [Cross-modal Adversarial Reprogramming](https://arxiv.org/abs/2102.07325).
## Installation
* Clone the repo: ``git clone https://github.com/paarthneekhara/multimodal_rerprogramming``
* Get the code for image models:
1) ``git submodule init``
2) ``git submodule update``
* Install TIMM (PyTorch image models):
1) ``cd pytorch-image-models``
2) ``pip install -r requirements.txt``
3) ``pip install -e .`` (this installs the library in editable mode)
* Install the transformers and datasets libraries from huggingface, plus tensorboardX and scikit-learn:
1) ``pip install datasets``
2) ``pip install transformers``
3) ``pip install tensorboardX``
4) ``pip install scikit-learn==0.23.2``
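After these steps, a quick sanity check (a hypothetical snippet, not part of the repo) can confirm that the editable TIMM install and the other dependencies resolve:

```python
# Hypothetical post-install check: all four dependencies should import,
# and timm should know the vision models used in the experiments.
import datasets
import tensorboardX
import timm
import transformers

print("timm:", timm.__version__)
print("transformers:", transformers.__version__)
print(timm.list_models("tf_efficientnet_b4*"))  # should be non-empty
```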
## Running the Experiments
The text/sequence dataset configurations are defined in ``data_utils.py``. You can either use text-classification [datasets available in the huggingface hub](https://huggingface.co/docs/datasets/) or custom datasets (defined as JSON files) exposed through the same API, as in the sketch below.
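For illustration (this is not code from the repo; the dataset name and file paths are placeholder assumptions), both sources yield objects with the same interface:

```python
from datasets import load_dataset

# A text-classification dataset from the huggingface hub.
hub_dataset = load_dataset("imdb")

# A custom dataset defined as JSON files; load_dataset's "json" builder
# returns an object with the same API as a hub dataset. Paths are placeholders.
custom_dataset = load_dataset(
    "json",
    data_files={"train": "train.json", "test": "test.json"},
)

print(hub_dataset["train"][0])
print(custom_dataset["train"][0])
```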
To reprogram an image model for a text classification task, run:

```
CUDA_VISIBLE_DEVICES=0 python reprogramming.py --text_dataset TEXTDATASET --logdir LOGDIR --cache_dir CACHE_DIR --reg_alpha 1e-4 --pretrained_vm 1 --resume_training 1 --use_char_tokenizer 0 --img_patch_size 16 --vision_model tf_efficientnet_b4
```
* ``TEXTDATASET`` is one of the dataset keys defined in ``data_utils.py``.
* Set ``--pretrained_vm`` to 1 for a pretrained vision model or 0 for a randomly initialized one.
* Set ``--use_char_tokenizer`` to 1 if you are using the DNA datasets (protein_splice, geneh3).
* ``--exp_name_extension``: name of the experiment that uniquely identifies your run.
* Set ``--resume_training`` to 1 if you want to continue a run that stopped.
* ``--img_patch_size``: the image patch size each sequence token is embedded into.
* For bounded experiments, supply ``--base_image_path library.jpg`` (or any other image) and ``--max_iterations 150000``.
* ``--vision_model`` is one of [vit_base_patch16_384, tf_efficientnet_b4, resnet50, tf_efficientnet_b7, inception_v3]; some configurations for these models are defined in ``data_utils.py``. A toy sketch of the overall reprogramming setup follows this list.
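To make the flag descriptions concrete, here is a heavily simplified sketch of the cross-modal reprogramming idea (all class names, shapes, and the linear label mapping are illustrative assumptions, not the repo's implementation): each token id is mapped to a learnable ``img_patch_size`` x ``img_patch_size`` patch, the patches are tiled into an image, and the image is classified by a frozen pretrained vision model.

```python
import timm
import torch
import torch.nn as nn

class ToyReprogrammer(nn.Module):
    """Illustrative only: token ids -> image patches -> frozen vision model."""

    def __init__(self, vocab_size=30522, patch=16, img_size=384, n_classes=2,
                 vision_model="vit_base_patch16_384"):
        super().__init__()
        self.patch, self.grid = patch, img_size // patch
        # Learnable lookup mapping each token id to one RGB patch.
        self.token_patches = nn.Embedding(vocab_size, 3 * patch * patch)
        # Pretrained vision model; its weights are frozen (only the patch
        # embedding and the label mapping are trained).
        self.vision = timm.create_model(vision_model, pretrained=True)
        for p in self.vision.parameters():
            p.requires_grad = False
        # Simple linear mapping from the vision model's logits to text labels.
        self.label_map = nn.Linear(self.vision.num_classes, n_classes)

    def forward(self, token_ids):  # token_ids: (batch, grid * grid)
        b = token_ids.shape[0]
        patches = self.token_patches(token_ids)
        patches = patches.view(b, self.grid, self.grid, 3, self.patch, self.patch)
        # Tile the per-token patches into a single (b, 3, img, img) image.
        image = patches.permute(0, 3, 1, 4, 2, 5).reshape(
            b, 3, self.grid * self.patch, self.grid * self.patch)
        return self.label_map(self.vision(torch.tanh(image)))  # tanh bounds pixels

model = ToyReprogrammer()
logits = model(torch.randint(0, 30522, (2, 24 * 24)))  # 384 / 16 = 24 patches/side
print(logits.shape)  # torch.Size([2, 2])
```

Per the paper, the bounded setting (``--base_image_path``) instead adds a norm-bounded perturbation to a base image rather than generating the image freely.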
Once the model is trained, you may use the ``InferenceNotebook.ipynb`` notebook to visualize the reprogrammed images. Accuracy and other metrics on the test set are logged to TensorBoard during training.

## Citing our work
```
@inproceedings{neekhara2022crossmodal,
title={Cross-modal Adversarial Reprogramming},
author={Neekhara, Paarth and Hussain, Shehzeen and Du, Jinglong and Dubnov, Shlomo and Koushanfar, Farinaz and McAuley, Julian},
booktitle={WACV},
year={2022}
}
```