{"id":28098998,"url":"https://github.com/thu-keg/dice","last_synced_at":"2025-05-13T17:59:21.466Z","repository":{"id":243160285,"uuid":"806315236","full_name":"THU-KEG/DICE","owner":"THU-KEG","description":"DICE: Detecting In-distribution Data Contamination with LLM's Internal State","archived":false,"fork":false,"pushed_at":"2024-06-12T01:46:29.000Z","size":3419,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-06-12T21:32:23.101Z","etag":null,"topics":["benchmark","data-contamination","fine-tuning-llm","gsm8k","llm","sft"],"latest_commit_sha":null,"homepage":"https://arxiv.org/pdf/2406.04197","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/THU-KEG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-27T00:20:06.000Z","updated_at":"2024-06-12T06:48:53.000Z","dependencies_parsed_at":"2024-06-07T03:41:52.605Z","dependency_job_id":"782d739a-f3d6-46b8-8b46-ad87bd7e54c6","html_url":"https://github.com/THU-KEG/DICE","commit_stats":null,"previous_names":["thu-keg/dice"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FDICE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FDICE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FDICE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THU-KEG%2FDICE/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/THU-KEG","download_url":"https://codeload.github.com/THU-KEG/DICE/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000173,"owners_count":21997402,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","data-contamination","fine-tuning-llm","gsm8k","llm","sft"],"created_at":"2025-05-13T17:59:20.428Z","updated_at":"2025-05-13T17:59:21.452Z","avatar_url":"https://github.com/THU-KEG.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DICE\n\nDICE: Detecting In-distribution Data Contamination with LLM's Internal State\n\nData and code for the paper.\n\n## Installation\n\n``` bash\ngit clone https://github.com/THU-KEG/DICE.git\ncd DICE\n```\n\n## Reproducing Results\n\n### Step 1: Fine-tune the contaminated model\n\n\nThe code for fine-tuning contaminated models is stored in the `OOD_test/scripts` folder.\n\n#### paraphrase benchmark\n\n``` bash\npython scripts/rewrite.py --dataset_name gsm8k\n```\n\nThe paraphrased dataset we used in the paper is available in the `OOD_test/scripts/data` folder.\n\n#### fine-tune\n\n- You can fine-tune a contaminated model as follows. Change the base model with `--model_name`.\n\n- Change the contaminated benchmark with `--train_dataset_name` and `--dataset_name`.\n\n- The parameter `--epochs 1` represents the 2% contamination setting in the paper. 
Omitting it represents the 10% setting.\n\n``` bash\ncd OOD_test\nCUDA_VISIBLE_DEVICES=0 python scripts/contaminated_finetune.py \\\n--model_name microsoft/phi-2 \\\n--generative_batch_size 32 \\\n--dataset_name gsm8k \\\n--train_dataset_name gsm8k \\\n--epochs 1\n```\n\n#### fine-tune scripts\n\nAlternatively, the following script directly reproduces the contaminated model of the main experiment in our paper.\n\n``` bash\nCUDA_VISIBLE_DEVICES=0 bash scripts/contaminated_finetune.sh\n```\n\n### Step 2: OOD Performance of contaminated models\n\nAs with the fine-tuning process above, you can use the following script to test OOD performance.\n\nThe parameter settings are the same as above, except that `--dataset_name` is the OOD dataset to be tested and `--train_dataset_name` is the contaminated dataset.\n\n``` bash\ncd OOD_test\nCUDA_VISIBLE_DEVICES=0 python OOD_generate_inf.py \\\n--model_name microsoft/phi-2 \\\n--generative_batch_size 32 \\\n--dataset_name math \\\n--train_dataset_name gsm8k \\\n--epochs 1\n```\n\n### Step 3: Locate contaminated layer\n\nThe code for this part is stored in the `Locate` folder.\n\n```bash\nCUDA_VISIBLE_DEVICES=0 python DICE_locate.py \\\n--edited_model=meta-llama/Llama-2-7b-hf \\\n--hparams_dir=../hparams/DICE/llama-7b\n```\n\n### Step 4: Train and test DICE detector\n\nThe code for this part is stored in the `contamination_classifier` folder.\n\n#### make data (hidden states of contaminated layer)\n\nUse the following script to generate the data.\n\n
- Change the base model with `--model_name`.\n\n- Change the benchmark to detect with `--test_dataset`.\n\n- `--is_contaminated` specifies whether the model is contaminated.\n\n- `--model_type` indicates whether the uncontaminated model is the vanilla model or the model fine-tuned only on orca.\n\n- `--contaminated_type` indicates whether the contaminated model was fine-tuned on the original benchmark (open) or on a paraphrased benchmark (Evasive).\n\n\n```bash\ncd contamination_classifier\nCUDA_VISIBLE_DEVICES=0 python data_maker.py \\\n--edited_model=meta-llama/Llama-2-7b-hf \\\n--hparams_dir=../hparams/DICE/llama-7b \\\n--test_dataset=GSM8K_seen \\\n--is_contaminated=True \\\n--model_type=vanilla \\\n--contaminated_type=open\n```\n\nThe following script directly reproduces the test data of the main experiment in our paper.\n\n``` bash\nCUDA_VISIBLE_DEVICES=0 bash scripts/make_test_data.sh\n```\n\n#### train and test DICE detector\n\nUse `train_test.py` to train and test a DICE detector.\n\nThe following script directly reproduces the test results of the main experiment in our paper.\n\n``` bash\nCUDA_VISIBLE_DEVICES=0 bash scripts/Test_DICE.sh\n```\n\n##### other experiment\n\nThe `contamination_classifier` folder contains the code for the main experiments in the paper. This includes the `performance_vs_score` subfolder, which holds the code for the experiment on the relationship between contamination probability and model performance, and `draw_OOD.py`, which draws the detection distribution of the OOD dataset, among others.\n\n\n# Acknowledgements\n\nOur implementation is based on the repository of the paper \"Evading Data Contamination Detection for Language Models is (too) Easy\" by Jasper Dekoninck, Mark Niklas Müller, Maximilian Baader, Marc Fischer, and Martin Vechev. The original repository can be found [here](https://github.com/eth-sri/malicious-contamination/). Their LICENSE file can be found in the `OOD_test` folder as well. 
We have modified the code to adapt it to our needs.\n\nWe wish to express our appreciation to the pioneers in the field of [evasive data contamination](https://arxiv.org/pdf/2402.02823). Our work was developed to address the attack presented in the evasive data contamination paper.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthu-keg%2Fdice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthu-keg%2Fdice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthu-keg%2Fdice/lists"}