{"id":19467703,"url":"https://github.com/thunlp/advbench","last_synced_at":"2025-08-03T08:05:12.073Z","repository":{"id":61664674,"uuid":"549966026","full_name":"thunlp/Advbench","owner":"thunlp","description":"Code and data of the EMNLP 2022 paper \"Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP\".","archived":false,"fork":false,"pushed_at":"2023-02-19T20:16:44.000Z","size":161,"stargazers_count":50,"open_issues_count":0,"forks_count":5,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-25T11:42:01.929Z","etag":null,"topics":["adversarial-examples","benchmark","natural-language-processing","security"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-12T02:16:59.000Z","updated_at":"2025-04-24T05:41:36.000Z","dependencies_parsed_at":"2024-11-10T18:49:31.691Z","dependency_job_id":null,"html_url":"https://github.com/thunlp/Advbench","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/thunlp/Advbench","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FAdvbench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FAdvbench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FAdvbench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FAdv
bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thunlp","download_url":"https://codeload.github.com/thunlp/Advbench/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thunlp%2FAdvbench/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268512159,"owners_count":24261887,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-03T02:00:12.545Z","response_time":2577,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adversarial-examples","benchmark","natural-language-processing","security"],"created_at":"2024-11-10T18:36:32.678Z","updated_at":"2025-08-03T08:05:12.048Z","avatar_url":"https://github.com/thunlp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Advbench\n\nCode and data of the EMNLP 2022 paper **\"Why Should Adversarial Perturbations be Imperceptible? 
Rethink the Research Paradigm in Adversarial NLP\"** [[PDF](https://arxiv.org/pdf/2210.10683v1.pdf)].\n\n## Overview\n\nIn this paper, we rethink the research paradigm of textual adversarial samples in security scenarios.\nWe discuss the deficiencies in previous work and suggest that research on **S**ecurity-**o**riented **ad**versarial **NLP** (**SoadNLP**) should:\n(1) evaluate methods on security tasks to demonstrate real-world concerns;\n(2) consider real-world attackers' goals, instead of developing impractical methods.\nTo this end, we first collect, process, and release a collection of security datasets, **Advbench**. Then, we reformalize the task and adjust the emphasis on different goals in SoadNLP. Next, we propose a simple method based on heuristic rules that can easily fulfill the actual adversarial goals to simulate real-world attack methods. We conduct experiments on both the attack and the defense sides on Advbench.\nExperimental results show that our method has higher practical value, indicating that the research paradigm in SoadNLP may start from our new benchmark.\n\n\u003cimg src=\"figs/main.png\" alt=\"main\" style=\"zoom:50%;\" /\u003e\n\n## Dependencies\n\n```\npip install -r requirements.txt\n```\n\nYou may need to change the versions of some libraries depending on your server.\n\n\n## Data Preparation\n\nFirst, you need to create the directory `data` to store the datasets:\n\n```\ncd Advbench\nmkdir data\n```\n\nThen you need to download the data from Google Drive [[data](https://drive.google.com/drive/folders/1_2q2282ZEoE_iPg8Q4ILGeB_aAkcP43v?usp=sharing)].\n\nWe provide the original dataset (**ori_dataset**), the processed dataset (**rel_dataset**), the experimental dataset (**exp_dataset**), and a compressed package for experiments. 
If you just want to reproduce the experiments, you should download **data.zip**, save it into the `data` directory, and unpack it with the following command:\n```\nunzip data.zip\n```\n\nIf you want to use our benchmark for further research, please download **rel_dataset**. If you want to process the raw data yourself, you can download **ori_dataset**. **exp_dataset** is just the uncompressed form of **data.zip**.\n\n## Experiments\n\nFirst, you need to create the directories `model` and `output` to store the fine-tuned models and the adversarial output datasets, respectively.\n```\nmkdir model\nmkdir output\n```\n\nThen you should fine-tune the pre-trained model on our security dataset collection **Advbench**.\n\n```\nbash scripts/train.sh\n```\n\nTo conduct the baseline attack experiments in our settings:\n\n```\nbash scripts/base_attack.sh\n```\n\nTo conduct attack experiments via ROCKET in our settings:\n\n```\nbash scripts/rocket.sh\n```\n\n## Citation\nPlease cite our paper:\n\n```\n@article{chen2022should,\n  title={Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP},\n  author={Chen, Yangyi and Gao, Hongcheng and Cui, Ganqu and Qi, Fanchao and Huang, Longtao and Liu, Zhiyuan and Sun, Maosong},\n  journal={arXiv preprint arXiv:2210.10683},\n  year={2022}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Fadvbench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthunlp%2Fadvbench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthunlp%2Fadvbench/lists"}