https://github.com/applied-machine-learning-lab/seed-attack
[ACL'25] Code implementation of the ACL'25 paper "Stepwise Reasoning Disruption Attack of LLMs"
- Host: GitHub
- URL: https://github.com/applied-machine-learning-lab/seed-attack
- Owner: Applied-Machine-Learning-Lab
- Created: 2024-12-09T08:52:04.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-03T04:11:10.000Z (10 months ago)
- Last Synced: 2025-06-03T16:25:03.236Z (10 months ago)
- Language: Python
- Homepage:
- Size: 16 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
# [ACL'25 Main] Stepwise Reasoning Disruption Attack of LLMs
Code implementation of the ACL'25 paper "Stepwise Reasoning Disruption Attack of LLMs".
## Usage
### 1. Question Modification (`QuestionModification.py`)
This component modifies original questions while preserving their semantic meaning.
```bash
python QuestionModification.py \
--llm_name \
--dataset \
--few_shot
```
### 2. Solution Generation (`GetSolutionofQuestionModified.py`)
Generates chain-of-thought (CoT) solutions for the modified questions.
```bash
python GetSolutionofQuestionModified.py \
--llm_name \
--dataset \
--few_shot
```
### 3. SEED-P Attack (`SEEDpAttack.py`)
Performs the SEED-P attack by injecting prior reasoning steps taken from the modified question's solution.
```bash
python SEEDpAttack.py \
--llm_name \
--dataset \
--ratio \
--few_shot
```
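As a rough illustration of the idea described in this step, the sketch below keeps a leading fraction of the reasoning steps from a modified question's solution and places them after the original question. It assumes that `--ratio` controls the fraction of steps carried over, which would be consistent with `--ratio 0.0` serving as the no-attack baseline. The function names `select_prior_steps` and `build_attack_prompt` and the prompt layout are hypothetical and are not taken from `SEEDpAttack.py`.

```python
def select_prior_steps(steps, ratio):
    """Return the leading fraction of reasoning steps (assumed meaning of --ratio)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    # round() avoids off-by-one surprises from floating-point products.
    keep = round(len(steps) * ratio)
    return steps[:keep]


def build_attack_prompt(question, modified_steps, ratio):
    """Hypothetical prompt layout: the question followed by injected prior steps."""
    prior = select_prior_steps(modified_steps, ratio)
    lines = [f"Question: {question}", "Solution:"]
    lines += [f"Step {i}: {s}" for i, s in enumerate(prior, start=1)]
    return "\n".join(lines)


# Illustrative data only; not drawn from the repository's datasets.
steps = [
    "Multiply 3 by 4 to get 12.",
    "Add 5 to 12.",
    "The sum is 17.",
    "The answer is 17.",
]
print(build_attack_prompt("What is 3 * 4 + 5?", steps, ratio=0.5))
```

With `ratio=0.0` the sketch injects no steps, so the prompt reduces to the plain question, which matches the role of the baseline run.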
### 4. Evaluation (`Evaluation.py`)
Evaluates accuracy and attack success rate (ASR).
Step 1: Run the baseline (no attack) for comparison:
```bash
python SEEDpAttack.py \
--llm_name \
--dataset \
--ratio 0.0 \
--few_shot
```
Step 2: Compute ASR and Accuracy:
```bash
python Evaluation.py \
--llm_name \
--dataset \
--ratio \
--few_shot
```
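The two metrics can be sketched as follows. This is a minimal illustration under one common definition of ASR: among questions answered correctly in the baseline run, the fraction answered incorrectly under attack. The exact definition used by `Evaluation.py` may differ, and the function names and sample data here are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the gold labels."""
    if not labels:
        raise ValueError("labels must be non-empty")
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)


def attack_success_rate(baseline_preds, attacked_preds, labels):
    """Assumed ASR: share of baseline-correct items that become wrong under attack."""
    solved = 0   # items the model answered correctly without the attack
    flipped = 0  # of those, items answered incorrectly under the attack
    for base, attacked, gold in zip(baseline_preds, attacked_preds, labels):
        if base == gold:
            solved += 1
            if attacked != gold:
                flipped += 1
    return flipped / solved if solved else 0.0


# Illustrative data only.
labels = ["17", "8", "42", "5"]
baseline = ["17", "8", "42", "6"]
attacked = ["17", "9", "40", "6"]

print("baseline accuracy:", accuracy(baseline, labels))
print("attacked accuracy:", accuracy(attacked, labels))
print("ASR:", attack_success_rate(baseline, attacked, labels))
```

Conditioning ASR on baseline-correct items keeps questions the model already failed from being counted as attack successes.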