{"id":50510392,"url":"https://github.com/jongwooko/distillm-2","last_synced_at":"2026-06-19T14:00:32.034Z","repository":{"id":298279054,"uuid":"945843585","full_name":"jongwooko/distillm-2","owner":"jongwooko","description":"Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)","archived":false,"fork":false,"pushed_at":"2025-06-27T12:19:26.000Z","size":1411,"stargazers_count":25,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-27T13:27:33.851Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2503.07067","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jongwooko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-10T08:07:21.000Z","updated_at":"2025-06-27T12:19:29.000Z","dependencies_parsed_at":"2025-06-10T09:42:17.021Z","dependency_job_id":"1a3b99bf-d48d-4b90-ba70-ddd2f049aa5a","html_url":"https://github.com/jongwooko/distillm-2","commit_stats":null,"previous_names":["jongwooko/distillm-2"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jongwooko/distillm-2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwooko%2Fdistillm-2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwooko%2Fdistillm-2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwooko%2Fdistillm-2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/jongwooko%2Fdistillm-2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jongwooko","download_url":"https://codeload.github.com/jongwooko/distillm-2/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwooko%2Fdistillm-2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34534278,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-02T20:00:26.251Z","updated_at":"2026-06-19T14:00:32.029Z","avatar_url":"https://github.com/jongwooko.png","language":"Python","funding_links":[],"categories":["🔬 OPD with Larger External Teachers — White-Box"],"sub_categories":[],"readme":"# DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)\n\n[![arXiv](https://img.shields.io/badge/Paper-arXiv:2503.07067-Green)](https://arxiv.org/abs/2503.07067)  [![BibTex](https://img.shields.io/badge/Paper-BibTex-yellow)](#bibtex)\n\nOfficial PyTorch implementation of **DistiLLM-2**, as presented in our paper:  \n[**DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs**](https://arxiv.org/abs/2503.07067)  \nby [Jongwoo Ko](https://sites.google.com/view/jongwooko)\u003csup\u003e1,2\u003c/sup\u003e, 
Tianyi Chen\u003csup\u003e2\u003c/sup\u003e, [Sungnyun Kim](https://sungnyunkim.notion.site/Sungnyun-Kim-4770a0182c47469ebdcd357cde97bd32)\u003csup\u003e1\u003c/sup\u003e, Tianyu Ding\u003csup\u003e2\u003c/sup\u003e, Luming Liang\u003csup\u003e2\u003c/sup\u003e, Ilya Zharkov\u003csup\u003e2\u003c/sup\u003e, and Se-Young Yun\u003csup\u003e1\u003c/sup\u003e  \n\u003csup\u003e1\u003c/sup\u003eKAIST AI \u0026nbsp;\u0026nbsp; \u003csup\u003e2\u003c/sup\u003eMicrosoft\n\n---\n\n## 🚀 Updates\n- [x] (25.06.10) The official code implementation is finally out.\n- [x] (25.06.09) DistiLLM-2 has been selected for an ***oral presentation at ICML (_top 1%_)***.\n- [x] (25.03.11) DistiLLM-2 paper is out! The preliminary code will be available in this repo, and the final code will be available [here](https://github.com/jongwooko/distillm-2).\n\n--- \n\n## 🔧 Environment Setup\n\nOur codebase builds upon the [alignment-handbook repository](https://github.com/huggingface/alignment-handbook). Follow the steps below to set up your environment:\n\n1. Create a Python virtual environment using, e.g., Conda:\n```bash\nconda create -n distillm2 python=3.10 \u0026\u0026 conda activate distillm2\n```\n\n2. Install PyTorch `v2.4.0`. Installation is hardware-dependent, so please refer to the [PyTorch Installation Page](https://pytorch.org/get-started/locally/). \n\n3. Install the remaining dependencies:\n```bash\npython -m pip install .\n```\n\n4. Install FlashAttention-2, which can be done by running:\n\n```bash\npython -m pip install flash-attn --no-build-isolation\n```\n\n5. (Optional) If you are running decoding with `gemma-2` models, you will also need to install `flashinfer`.\n\n```shell\npython -m pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4\n```\n\n## 🚀 Generation\n\n0. (Optional) Perform supervised fine-tuning before running DistiLLM-2. This step can be skipped if you are using an instruction-tuned model as the student. We recommend performing this step. 
However, if you choose to skip it, we suggest reducing the number of training iterations for DistiLLM-2 as described in Appendix D.2.\n\n```bash\naccelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml --num_processes=4 src/run_sft.py training_configs/qwen2.5-1.5b-sft.yaml\n```\n\n1. Generate responses using the language model:\n\n```bash\npython generate/generate_vllm.py --model $MODEL_DIR --output_dir $OUTPUT_DIR --seed $SEED\n```\n\nThis script generates one response per prompt. You can specify your prompt dataset (by default, we use `HuggingFaceH4/ultrachat_200k`). You can also set decoding hyperparameters by passing in the corresponding arguments (by default, we use a temperature of `0.8` for sampling).\n\n2. Reformat into teacher-student pairs:\n\n```bash\npython generate/reformat.py --teacher_file $TEACHER_DIR --student_file $STUDENT_DIR --output_dir $OUTPUT_DIR\n```\n\n## 🏋️ Training\n\n**Note**: Some LLMs (e.g., Qwen2.5) use different classifier head sizes depending on the model scale. To align them before distillation:\n```Shell\npython utils/resize_embedding.py --teacher-model Qwen/Qwen2.5-7B-Instruct --student-model Qwen/Qwen2.5-0.5B-Instruct\n```\n\nWe provide training configuration files for the four setups described in the paper. Each configuration is designed for either a 2×A100 or 4×A100 GPU setup. 
You may need to adjust `num_processes` and `per_device_train_batch_size` to match your environment.\nYou can modify the student and teacher models by changing the values of `model_name_or_path` and `ref_model_name_or_path` in the configuration file.\n\n* Qwen2.5-1.5B-DistiLLM2 (4xA100):\n```bash\naccelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml --num_processes=4 src/run_distillm.py training_configs/qwen2.5-1.5b-distillm2.yaml\n```\n\n* Deepseek-Coder-1.3B-DistiLLM2 (4xA100):\n```bash\naccelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml --num_processes=4 src/run_distillm.py training_configs/deepseek-coder-1.3b-distillm2.yaml\n```\n\n* TinyLLaVA (4xA100):\n```bash\naccelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml --num_processes=4 src/run_distivlm.py training_configs/vlm.yaml\n```\n\n## 📊 Evaluation\n\nWe evaluate on AlpacaEval, Evol-Instruct, and UltraFeedback. We generate responses for pairwise comparison with the LLM-as-a-Judge framework. For AlpacaEval, we use the official responses from `text-davinci-003`. For Evol-Instruct and UltraFeedback, we use responses from `gpt-3.5-turbo`. As judge models, we use `GPT-4o` for AlpacaEval and Evol-Instruct, and `GPT-4o-mini` for UltraFeedback.\n\n**Note**: The MATH dataset (Hendrycks et al., 2021) is currently blocked. Although we use the `hendrycks/competition_math` dataset on Hugging Face for prompts, it is not available for use at this time.\n\n1. Generate outputs for evaluation (e.g., Evol-Instruct):\n\n```shell\npython utils/merging.py --base-model-name ${STUDENT_DIR} --lora-model-name ${LORA_DIR}\n\npython generate/generate_vllm.py --model ${LORA_DIR}/merged --output_dir ${OUTPUT_DIR} --data_dir evol-instruct --seed 200\n```\n\n2. 
Run LLM-as-a-Judge evaluation (e.g., `gpt-4o`): \n\n```shell\npython eval/build_evaluation.py --data-path1 eval/evol-instruct/evol_inst_eval.json --data-path2 $OUTPUT_DIR/output_200.json --pairwise --output-file evol_inst-${EXP_NAME} --judge gpt-4o\n\npython eval/build_evaluation.py --data-path2 eval/evol-instruct/evol_inst_eval.json --data-path1 $OUTPUT_DIR/output_200.json --pairwise --output-file ${EXP_NAME}-evol_inst --judge gpt-4o\n\nbash eval/run.sh ${3} ${EXP_NAME}\n```\n\n## 📚 BibTeX\nIf you find this repo useful for your research, please consider citing us:\n\n```\n@inproceedings{kodistillm,\n  title={DistiLLM: Towards Streamlined Distillation for Large Language Models},\n  author={Ko, Jongwoo and Kim, Sungnyun and Chen, Tianyi and Yun, Se-Young},\n  booktitle={Forty-first International Conference on Machine Learning}\n}\n\n@inproceedings{\nko2025distillm,\ntitle={Disti{LLM}-2: A Contrastive Approach Boosts the Distillation of {LLM}s},\nauthor={Jongwoo Ko and Tianyi Chen and Sungnyun Kim and Tianyu Ding and Luming Liang and Ilya Zharkov and Se-Young Yun},\nbooktitle={Forty-second International Conference on Machine Learning},\nyear={2025},\nurl={https://openreview.net/forum?id=rc65N9xIrY}\n}\n```\n\n## ✉️ Contact\nIf you have any questions or feedback, feel free to reach out:\n- Jongwoo Ko: jongwoo.ko@kaist.ac.kr\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjongwooko%2Fdistillm-2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjongwooko%2Fdistillm-2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjongwooko%2Fdistillm-2/lists"}