https://github.com/timlrx/igcg
- Host: GitHub
- URL: https://github.com/timlrx/igcg
- Owner: timlrx
- Created: 2024-08-03T04:44:46.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-06T00:17:38.000Z (almost 2 years ago)
- Last Synced: 2024-10-04T19:44:21.849Z (over 1 year ago)
- Language: Python
- Size: 1.72 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# I-GCG
The official repository for [Improved Techniques for Optimization-Based Jailbreaking on Large Language Models](https://arxiv.org/abs/2405.21018).
Please feel free to contact jiaxiaojunqaq@gmail.com if you have any questions.
## Quick Start
### 1. Generate suffix initialization
```bash
python attack_llm_core_best_update_our_target.py --behaviors_config=behaviors_ours_config.json
```
### 2. Generate new json with the initialization
```bash
python generate_our_config.py
```
### 3. Conduct jailbreaking attack
```bash
python run_multiple_attack_our_target.py --behaviors_config=behaviours_gcss_config_init_v2_continued.json --output_path=gcss --model_path="/home/LLM/Llama-2-7b-chat-hf"
```
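The three commands above run in a fixed order, and only the last one takes a machine-specific argument (`--model_path`). As a convenience, the minimal sketch below assembles the same three command lines for a given model path and prints them without executing anything. The helper names `build_pipeline` and `format_pipeline` are hypothetical and not part of this repository; the script names and flags are copied verbatim from the steps above.

```python
import shlex
import sys


def build_pipeline(model_path, output_path="gcss"):
    """Return the three Quick Start commands as argv lists, in order.

    Hypothetical helper: script names and flags are taken from the README;
    nothing here inspects or runs the scripts themselves.
    """
    py = sys.executable  # use the interpreter running this sketch
    return [
        # Step 1: generate suffix initialization
        [py, "attack_llm_core_best_update_our_target.py",
         "--behaviors_config=behaviors_ours_config.json"],
        # Step 2: generate new json with the initialization
        [py, "generate_our_config.py"],
        # Step 3: run with the generated config and a local model path
        [py, "run_multiple_attack_our_target.py",
         "--behaviors_config=behaviours_gcss_config_init_v2_continued.json",
         f"--output_path={output_path}",
         f"--model_path={model_path}"],
    ]


def format_pipeline(commands):
    """Render each argv list as a shell-quoted command line."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Dry run: print the commands so paths can be checked before running them.
    for line in format_pipeline(build_pipeline("/home/LLM/Llama-2-7b-chat-hf")):
        print(line)
```

Printing rather than executing keeps the sketch safe to run anywhere; replace the example model path with the location of your local checkpoint.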
## Experiments
### Comparison results with SOTA jailbreak methods
### Transferable performance of jailbreak suffix
## Citation
Kindly include a reference to this paper in your publications if it helps your research:
```
@article{jia2024improved,
title={Improved Techniques for Optimization-Based Jailbreaking on Large Language Models},
author={Xiaojun Jia and Tianyu Pang and Chao Du and Yihao Huang and Jindong Gu and Yang Liu and Xiaochun Cao and Min Lin},
year={2024},
eprint={2405.21018}
}
```