{"id":13489554,"url":"https://github.com/OODRobustBench/OODRobustBench","last_synced_at":"2025-03-28T05:31:03.468Z","repository":{"id":239728594,"uuid":"798290602","full_name":"OODRobustBench/OODRobustBench","owner":"OODRobustBench","description":"OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024","archived":false,"fork":false,"pushed_at":"2024-07-25T06:38:09.000Z","size":2163,"stargazers_count":18,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-31T03:33:38.220Z","etag":null,"topics":["adversarial-examples","adversarial-machine-learning","out-of-distribution","robustness"],"latest_commit_sha":null,"homepage":"https://oodrobustbench.github.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OODRobustBench.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-09T13:36:53.000Z","updated_at":"2024-10-18T10:48:28.000Z","dependencies_parsed_at":"2024-06-02T03:49:24.524Z","dependency_job_id":"5124f81e-4147-4b27-8ba2-97d6cbc81f66","html_url":"https://github.com/OODRobustBench/OODRobustBench","commit_stats":null,"previous_names":["oodrobustbench/oodrobustbench"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OODRobustBench%2FOODRobustBench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OODRobustBench%2FOODRobustBench/tags","releases_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/OODRobustBench%2FOODRobustBench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OODRobustBench%2FOODRobustBench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OODRobustBench","download_url":"https://codeload.github.com/OODRobustBench/OODRobustBench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245978200,"owners_count":20703675,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adversarial-examples","adversarial-machine-learning","out-of-distribution","robustness"],"created_at":"2024-07-31T19:00:30.789Z","updated_at":"2025-03-28T05:31:02.221Z","avatar_url":"https://github.com/OODRobustBench.png","language":"Python","funding_links":[],"categories":["Benchmarks","Python"],"sub_categories":[],"readme":"# OODRobustBench: Adversarial Robustness under Distribution Shift\n**Lin Li (KCL), Yifei Wang (MIT), Chawin Sitawarin (UC Berkeley), Michael Spratling (KCL)**\n\nThis is the official code of the paper \"OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift\". This work has been accepted by the main conference of ICML 2024 and the workshop Data-centric Machine Learning Research (DMLR) of ICLR 2024.\n\nThe leaderboard: https://oodrobustbench.github.io/\n\nPaper: https://arxiv.org/abs/2310.12793\n\n## 1. 
High-level idea and design\n\nExisting works have made great progress in improving adversarial robustness, but typically test their methods only on data from the same distribution as the training data, i.e. in-distribution (ID) testing. As a result, it is unclear how such robustness generalizes under input distribution shifts, i.e. out-of-distribution (OOD) testing. This is a concerning omission, as such distribution shifts are unavoidable when methods are deployed in the wild. To address this issue, we propose a benchmark named OODRobustBench to comprehensively assess OOD adversarial robustness using 23 dataset-wise shifts (i.e. naturalistic shifts in input distribution) and 6 threat-wise shifts (i.e. unforeseen adversarial threat models).\n\n![](assets/benchmark_construction.png)\n\nThe code of OODRobustBench is built on top of [RobustBench](https://github.com/RobustBench/robustbench) to provide a unified, RobustBench-like interface for evaluating and loading models, and to support loading the latest models from RobustBench. In other words, if you know how to use RobustBench, you already know how to use OODRobustBench. If you have not used RobustBench before, don't worry: we provide a detailed, easy-to-follow guide below for preparation and usage. \n\n## 2. Preparation\n\n### 2.1. Installation\n\nFirst of all, **Python 3.8 is strongly recommended** because the dependencies of [robustbench](https://github.com/RobustBench/robustbench) and [perceptual-advex](https://github.com/cassidylaidlaw/perceptual-advex) conflict when a higher version such as Python 3.9 is used. 
\n\n#### As a repository\n\nClone the project and run the following command to install the required packages:\n\n```bash\npip install -r requirements.txt\n```\n\n#### As a package\n\n```bash\npip install git+https://github.com/OODRobustBench/OODRobustBench.git\n```\n\nThis enables importing the package as follows:\n\n```python\nfrom oodrobustbench.eval import benchmark\n```\n\n#### Resolve a Python compatibility issue\n\nUnfortunately, the latest version of [robustbench](https://github.com/RobustBench/robustbench) has an [issue](https://github.com/RobustBench/robustbench/blob/master/robustbench/model_zoo/architectures/robustarch_wide_resnet.py) with loading models in Python 3.8. To run the code, follow these steps:\n\n1. Locate the file `robustbench/model_zoo/architectures/robustarch_wide_resnet.py` in your Python environment. The location is shown in the error message raised when you attempt to import `oodrobustbench`.\n2. Open the file in your preferred text editor.\n3. Uncomment line 10 to import `List` from `typing`. You should see the original warning here.\n4. Replace all instances of `list[]` with `List[]`. This can be quickly done by replacing `list[` with `List[`.\n\nI will monitor updates to robustbench and remove this section once the issue is resolved.\n\n### 2.2. Data\n\nWe suggest putting all datasets under the same directory (say $DATA) to ease data management and avoid modifying the data-loading source code. An overview of the file structure is shown below:\n\n```bash\n$DATA/\n|–– cifar-10-batches-py/\n|–– cifar-10.1/\n|–– cifar-10.2/\n|–– cifar-10-r/\n|–– CIFAR-10-C/\n|–– CINIC-10/\n|–– imagenet/\n|–– imagenetv2-matched-frequency-format-val/\n|–– imagenet-a/\n|–– imagenet-r/\n|–– objectnet/\n|–– ImageNet-C/\n|–– ImageNet-V/\n|–– ImageNet-Sketch/\n```\n\nThe above datasets are divided into two groups:\n\n1. Automatic download: CIFAR10, CIFAR10-C\n2. 
Manual download: [CIFAR10.1](https://github.com/modestyachts/CIFAR-10.1/tree/master/datasets), [CIFAR10.2](https://github.com/modestyachts/cifar-10.2), [CIFAR10-R](https://github.com/TreeLLi/cifar10-r), [CINIC-10](https://datashare.ed.ac.uk/handle/10283/3192), [ImageNet](https://image-net.org/download.php), [ImageNet-v2](https://huggingface.co/datasets/vaishaal/ImageNetV2/tree/main), [ImageNet-A](https://people.eecs.berkeley.edu/~hendrycks/imagenet-a.tar), [ImageNet-R](https://people.eecs.berkeley.edu/~hendrycks/imagenet-r.tar), [ImageNet-C](https://github.com/hendrycks/robustness#imagenet-c), [ObjectNet](https://www.kaggle.com/datasets/treelinli/objectnet-imagenet-overlap), [ImageNet-V](https://github.com/Heathcliff-saku/ViewFool_/tree/master), [ImageNet-Sketch](https://github.com/HaohanWang/ImageNet-Sketch)\n\nDatasets in Group 1 will be downloaded automatically when used. Datasets in Group 2 need to be downloaded from the given links (click the dataset name) and structured as specified above. A guide on how to prepare ImageNet can be found [here](https://github.com/soumith/imagenet-multiGPU.torch#data-processing).\n\nNote that the folder names must match the structure above exactly unless you modify our source code. We suggest using [soft links](https://en.wikipedia.org/wiki/Symbolic_link) to reuse datasets you already have by linking them into $DATA. \n\n## 3. Model Zoo\n\nOur model zoo consists of three groups: \n\n1. *RobustBench models* are loaded through the [robustbench](https://github.com/RobustBench/robustbench) API\n2. *Public collected models* are manually collected from public adversarial ML works (not covered by RobustBench at the release time of this work). \n3. 
*Custom models* were trained by ourselves; see [here](#5.-Custom-Models) for details.\n\nNotably, thanks to the deep integration with RobustBench, **OODRobustBench can seamlessly load the latest submissions to RobustBench** by simply upgrading the RobustBench package in the dev environment.\n\n### 3.1. Automatic model loading\n\nTo load a model, we provide a unified API similar to RobustBench:\n\n```python\nfrom oodrobustbench.utils import load_model\n# an example of loading a model named Wong2020Fast for CIFAR10 Linf\nmodel = load_model(model_name='Wong2020Fast', \n                   model_dir='/root/to/models',\n                   dataset='cifar10',\n                   threat_model='Linf')\n```\n\nReplace `/root/to/models` with the actual directory for `model_dir`. By default, `models` under the working directory is used. \n\nThe weights of RobustBench models and *custom models* are downloaded automatically and placed in the specified directory if needed, while the weights of *public collected* models currently need to be downloaded manually (for now, please email Lin Li for links) and placed in the appropriate directory. \n\nRegarding model names, please refer to [robustbench](https://github.com/RobustBench/robustbench) for RobustBench models, [this section](#5.-Custom-Models) for custom models, and [this source code file](https://github.com/OODRobustBench/OODRobustBench/blob/main/oodrobustbench/models/__init__.py) for public collected models.\n\n### 3.2. Contribution: submit your model to the leaderboard and / or model zoo \n\nThanks for your interest! To submit your model to the leaderboard, you first need to enable automatic loading of your model as follows:\n\n1. (Optional) add your custom model architecture file under `/oodrobustbench/models`\n2. Modify `/oodrobustbench/models/__init__.py` to add a callable constructor of your model arch\n3. 
Modify `load_model()` in `/oodrobustbench/utils.py` to load the trained weights into your model\n\nPlease refer to [this doc](https://github.com/RobustBench/robustbench/tree/master?tab=readme-ov-file#model-definition) for guidance, or contact the author if further clarification is required. \n\nAfter successfully adding your model, i.e. once it can be loaded automatically by the code, you need to email the author (`linli.tree@outlook.com`) with the above modified files and your model weights for a submission to the leaderboard. \n\n## 4. Evaluation\n\nWe describe below the template commands we used to get the results reported in our paper. The output results are saved under the directory `model_info/$DATASET/$THREAT_MODEL`. The program automatically saves the result of each shift evaluation and reloads saved results when they exist, so there is no need to worry about the evaluation being interrupted.\n\n### 4.1. Dataset shift\n\nFor CIFAR10 $\\ell_\\infty$ models under all corruptions and natural shifts with 10k samples:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf --adv-norm Linf -a mm5 --corruption-models corruptions --natural-shifts all -n 10000 --model_name $MODEL_NAME\n```\n\nPlease refer to the code of `oodrobustbench/eval.py` or run the command `python -m oodrobustbench.eval -h` for the explanation and the candidate values of each argument. 
\n\nFor CIFAR10 $\\ell_2$ models:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model L2 --adv-norm L2 -a mm5 --corruption-models corruptions --natural-shifts all -n 10000 --model_name $MODEL_NAME\n```\n\nFor ImageNet $\\ell_\\infty$ models:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --dataset imagenet --threat-model Linf --adv-norm Linf -a mm5 --eps 0.01568627 --corruption-models corruptions --natural-shifts all -n 5000 --model_name $MODEL_NAME\n```\n\n### 4.2. Threat shift\n\nFor CIFAR10 $\\ell_\\infty$ models against LPA threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf -a lpa --eps 0.5 -n 10000 --model_name $MODEL_NAME\n```\n\nFor CIFAR10 $\\ell_\\infty$ models against PPGD threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf -a ppgd --eps 0.5 -n 10000 --model_name $MODEL_NAME\n```\n\nFor CIFAR10 $\\ell_\\infty$ models against StAdv threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf -a stadv --eps 0.05 -n 10000 --model_name $MODEL_NAME\n```\n\nFor CIFAR10 $\\ell_\\infty$ models against ReColor threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf -a recolor --eps 0.06 -n 10000 --model_name $MODEL_NAME\n```\n\nFor CIFAR10 $\\ell_\\infty$ models against different $p$-norm threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf --adv-norm L2 -a mm5 --eps 0.5 -n 10000 --model_name $MODEL_NAME\n```\n\nFor CIFAR10 $\\ell_\\infty$ models against different $\\epsilon$ threat shift:\n\n```bash\npython -m oodrobustbench.eval --data_dir $DATA --threat-model Linf --adv-norm Linf -a mm5 --eps 0.0470588 -n 10000 --model_name $MODEL_NAME\n```\n\nPlease refer to our paper for the configuration of threat shifts for the settings other than CIFAR10 $\\ell_\\infty$.\n\n### 4.3. Evaluate your own model\n\nTo evaluate your own models, you can use 
`benchmark()` as exemplified below:\n\n```python\nfrom oodrobustbench.eval import benchmark\nmodel = ...  # placeholder: initialize your own model here\nmodel.eval()\nid_acc, id_rob, ood_acc_robs = benchmark(model,\n                                         n_examples=10000,\n                                         dataset='cifar10',\n                                         attack='mm5',\n                                         threat_model='Linf',\n                                         adv_norm='Linf',\n                                         natural_shifts='all',\n                                         corruption_models='corruptions',\n                                         corruptions=None,\n                                         severities=[1,2,3,4,5],\n                                         to_disk=True,\n                                         model_name='my_model',  # placeholder: your $MODEL_NAME\n                                         data_dir='/path/to/data',  # placeholder: your $DATA directory\n                                         device='cuda',\n                                         batch_size=100,\n                                         eps=8/255)\n```\n\nThis is actually what happens when you run `python -m oodrobustbench.eval`. Note that the results will also be saved to disk when calling `benchmark()` to evaluate.\n\n## 5. Custom Models\n\nThe following instructions explain how to load custom models that we have trained ourselves. Model names are fairly long, start with `custom_`, and encode the hyperparameter choices, for example `custom_convmixer_trades_trades_seed0_bs512_lr0.1_wd0.0001_sgd_50ep_eps0.5_beta0.1`.\n\nThe weights are hosted on Zenodo and are downloaded automatically when a model is called. 
Download speed from the Zenodo server can be poor at times, so if you know you want to use all the models, you can download all the weights at once with `zenodo_get` and put them in the right location:\n\n```bash\npip install zenodo_get\ncd $MODEL_PATH  # mkdir if needed\nzenodo_get $DEPOSIT_ID\n```\n\nwhere `$DEPOSIT_ID` and `$MODEL_PATH` are the Zenodo deposit ID and the associated model path. Each deposit has a maximum size of 40 GB and contains a group of models denoted by the path. See the list of `$DEPOSIT_ID: $MODEL_PATH` below:\n\n* `8285099`: `cifar10/L2`.\n\nPlease install the extra packages in `requirements.txt`. See `oodar.models.custom_models.utils._MODEL_DATA` for the list of available models.\n\n## 6. Citation\n\n```\n@inproceedings{li2024oodrobustbench,\n    title={OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift},\n    author={Lin Li and Yifei Wang and Chawin Sitawarin and Michael Spratling},\n    booktitle={International Conference on Machine Learning},\n    year={2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOODRobustBench%2FOODRobustBench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FOODRobustBench%2FOODRobustBench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOODRobustBench%2FOODRobustBench/lists"}