https://github.com/applied-machine-learning-lab/seed-attack

[ACL'25] Code implementation of the ACL'25 paper "Stepwise Reasoning Disruption Attack of LLMs"

# [ACL'25 Main] Stepwise Reasoning Disruption Attack of LLMs

Code implementation of the ACL'25 paper "Stepwise Reasoning Disruption Attack of LLMs".

## Usage

### 1. Question Modification (`QuestionModification.py`)

This component modifies original questions while preserving their semantic meaning.

```bash
python QuestionModification.py \
--llm_name \
--dataset \
--few_shot
```

### 2. Solution Generation (`GetSolutionofQuestionModified.py`)

Generates chain-of-thought (CoT) solutions for the modified questions.

```bash
python GetSolutionofQuestionModified.py \
--llm_name \
--dataset \
--few_shot
```

### 3. SEED-P Attack (`SEEDpAttack.py`)

Performs the SEED-P attack by introducing prior reasoning steps of the modified question.

```bash
python SEEDpAttack.py \
--llm_name \
--dataset \
--ratio \
--few_shot
```
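The idea of introducing prior reasoning steps can be illustrated with a minimal sketch. It is not the code in `SEEDpAttack.py`: the helper names `take_prior_steps` and `build_attacked_prompt`, the prompt layout, and the reading of `--ratio` as the fraction of the modified question's solution steps to inject are all assumptions made for illustration. Under that reading, a ratio of `0.0` injects nothing, which matches the no-attack baseline used in the evaluation section.

```python
import math


def take_prior_steps(steps, ratio):
    """Return the leading fraction `ratio` of `steps` (hypothetical helper).

    Assumption: `ratio` is the share of reasoning steps to inject, in [0, 1].
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    count = math.ceil(ratio * len(steps))
    return steps[:count]


def build_attacked_prompt(question, modified_steps, ratio):
    """Prefix the original question with steps taken from the modified question.

    The prompt layout here is illustrative only, not the repository's format.
    """
    prior = take_prior_steps(modified_steps, ratio)
    lines = [f"Question: {question}", "Solution:"]
    lines += [f"Step {i}: {step}" for i, step in enumerate(prior, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    steps = ["first step", "second step", "third step", "fourth step"]
    print(build_attacked_prompt("original question text", steps, 0.5))
```

With `ratio=0.0` the prompt contains only the question and an empty solution header, so the model starts its reasoning from scratch.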

### 4. Evaluation (`Evaluation.py`)

Evaluates accuracy and Attack Success Rate (ASR).

Step 1: Run the baseline (no attack) for comparison:

```bash
python SEEDpAttack.py \
--llm_name \
--dataset \
--ratio 0.0 \
--few_shot
```

Step 2: Compute ASR and Accuracy:

```bash
python Evaluation.py \
--llm_name \
--dataset \
--ratio \
--few_shot
```
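As a rough illustration of the two metrics, the sketch below computes accuracy and ASR from lists of predicted and gold answers. It is not the logic of `Evaluation.py`: the function names are hypothetical, and ASR is assumed here to be the fraction of questions answered correctly in the baseline run that become incorrect under attack. The paper's exact definition may differ.

```python
def accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def attack_success_rate(baseline_preds, attacked_preds, gold):
    """Assumed ASR: share of baseline-correct answers that the attack flips."""
    if not (len(baseline_preds) == len(attacked_preds) == len(gold)):
        raise ValueError("all inputs must have the same length")
    correct_before = 0
    flipped = 0
    for base, attacked, answer in zip(baseline_preds, attacked_preds, gold):
        if base == answer:
            correct_before += 1
            if attacked != answer:
                flipped += 1
    # No baseline-correct questions means there is nothing to flip.
    return flipped / correct_before if correct_before else 0.0


if __name__ == "__main__":
    gold = ["A", "B", "C", "D"]
    baseline = ["A", "B", "C", "A"]   # ratio 0.0 run (no attack)
    attacked = ["A", "C", "D", "A"]   # attacked run
    print("baseline accuracy:", accuracy(baseline, gold))
    print("attacked accuracy:", accuracy(attacked, gold))
    print("ASR:", attack_success_rate(baseline, attacked, gold))
```

Conditioning ASR on baseline-correct questions keeps answers the model already got wrong from being counted as attack successes.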