Awesome-Knowledge-Distillation-of-LLMs
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break knowledge distillation (KD) down into knowledge elicitation and distillation algorithms, and cover both skill distillation and vertical (domain-specific) distillation of LLMs.
https://github.com/Tebmer/Awesome-Knowledge-Distillation-of-LLMs
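For orientation, the sketch below illustrates the basic white-box distillation objective that many of the KD-algorithm papers listed further down build on: the student is trained to match the teacher's token-level output distribution on a batch of text. This is a minimal sketch under assumed stand-in models (GPT-2 variants via Hugging Face Transformers) and a plain forward KL, not any single paper's recipe; the surveyed methods differ in the divergence (reverse KL, f-divergences, skew KL, ...), the data source (teacher- vs. student-generated text), and whether teacher logits are available at all (black-box skill distillation instead fine-tunes the student directly on teacher-generated responses).

```python
# Minimal white-box KD sketch: train a small student LM to match a larger
# teacher's next-token distribution. Model names, temperature, and the plain
# forward KL are illustrative assumptions only.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "gpt2-large"  # stand-in teacher (assumption)
student_name = "gpt2"        # stand-in student; shares the teacher's vocabulary

tok = AutoTokenizer.from_pretrained(teacher_name)
tok.pad_token = tok.eos_token  # GPT-2 tokenizers ship without a pad token
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()
student = AutoModelForCausalLM.from_pretrained(student_name)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)


def kd_step(texts, temperature=2.0):
    """One distillation step on a batch of (teacher- or self-generated) texts."""
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        t_logits = teacher(**batch).logits / temperature
    s_logits = student(**batch).logits / temperature
    # Token-level KL(teacher || student), ignoring padding positions.
    kl = F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.softmax(t_logits, dim=-1),
        reduction="none",
    ).sum(-1)
    mask = batch["attention_mask"].float()
    loss = (kl * mask).sum() / mask.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


print(kd_step(["Knowledge distillation transfers a large model's behavior to a smaller one."]))
```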
Skill Distillation
NLP Task Specialization
- **TinyLLM: Learning a Small Student from Multiple Large Language Models** - 02 |
- **SunGen: Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning** - 05 | [Github](https://github.com/SumilerGAO/SunGen)
- **Magicoder: Source Code Is All You Need** - 12 | [Github](https://github.com/ise-uiuc/magicoder) | [Data](https://huggingface.co/datasets/ise-uiuc/Magicoder-OSS-Instruct-75K) <br> [Data](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K)|
- **WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation** - 12 |
- **Soft prompt tuning for augmenting dense retrieval with large language models** - 07 | [Github](https://github.com/zhiyuanpeng/SPTAR.git)
- **Query Rewriting in Retrieval-Augmented Large Language Models** - 05
- **AugTriever: Unsupervised Dense Retrieval by Scalable Data Augmentation** - 12 | [Github](https://github.com/salesforce/AugTriever)
- **QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation** - 10 |
- **Promptagator: Few-shot Dense Retrieval From 8 Examples** - 09 |
- **Questions Are All You Need to Train a Dense Passage Retrieval** - 06 | [Github](https://github.com/DevSinghSachan/art) |
- **Data Augmentation for Radiology Report Simplification** - 04 | [Github](https://github.com/Ziyu-Yang/Radiology-Text-Simplification-Liver)
- **InstructDistill: Instruction Distillation Makes Large Language Models Efficient Zero-shot Rankers** - 11 | [Github](https://github.com/sunnweiwei/RankGPT/tree/main/InstructDistill)| [Data](https://github.com/sunnweiwei/RankGPT?tab=readme-ov-file#download-data-and-model)
- **LLM vs Small Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model** - 03 |
- **Evolving Knowledge Distillation with Large Language Models and Active Learning** - 03 |
- **PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation** - 10 | [Github](https://github.com/ServiceNow/PromptMix-EMNLP-2023) |
- **Targeted Data Generation: Finding and Fixing Model Weaknesses** - 05 | [Github](https://github.com/ZexueHe/TDG)|
- **Distilling ChatGPT for Explainable Automated Student Answer Assessment** - 05 | [Github](https://github.com/lijiazheng99/aera) |
- **AugGPT: Leveraging ChatGPT for Text Data Augmentation** - 02 | [Github](https://github.com/yhydhx/AugGPT)|
- **Is GPT-3 a Good Data Annotator?** - 12 | [Github](https://github.com/DAMO-NLP-SG/LLM-Data-Annotator)|
- **ZeroGen: Efficient Zero-shot Learning via Dataset Generation** - 02 | [Github](https://github.com/jiacheng-ye/ZeroGen)|
- **Generating Training Data with Language Models: Towards Zero-Shot Language Understanding** - 02 | [Github](https://github.com/yumeng5/SuperGen)
- **Towards Zero-Label Language Learning** - 09 |
- **Generate, Annotate, and Learn: NLP with Synthetic Text** - 06
- **Tailoring Self-Rationalizers with Multi-Reward Distillation** - 11 | [Github](https://inklab.usc.edu/MaRio/)| [Data](https://inklab.usc.edu/MaRio/)|
- **Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing** - 05 | [Github](https://github.com/jaehunjung1/impossible-distillation)|
- **RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation** - 10 | [Github](https://github.com/carriex/recomp)|
- **Neural Machine Translation Data Generation and Augmentation using ChatGPT** - 07 |
- **On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes** - 06 |
- **Can LLMs generate high-quality synthetic note-oriented doctor-patient conversations?** - 06 | [Github](https://github.com/believewhat/Dr.NoteAid) | [Data](https://huggingface.co/datasets/akemiH/NoteChat)|
- **InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT** - 05 |
- **Improving Passage Retrieval with Zero-Shot Question Generation** - 04 | [Github](https://github.com/DevSinghSachan/unsupervised-passage-reranking) | [Data](https://github.com/DevSinghSachan/unsupervised-passage-reranking)|
- **Generating Datasets with Pretrained Language Models** - 04 | [Github](https://github.com/timoschick/dino) |
- **ONCE: Boosting Content-based Recommendation with Both Open- and Closed-source Large Language Models** - 05 | [Github](https://github.com/Jyonn/ONCE) | [Data](https://github.com/Jyonn/ONCE/releases/tag/Dataset)
- **Can Small Language Models be Good Reasoners for Sequential Recommendation?** - 03 |
- **Large Language Model Augmented Narrative Driven Recommendations** - 06 |
- **Recommendation as Instruction Following: A Large Language Model Empowered Recommendation Approach** - 05 |
- **Prometheus: Inducing Fine-grained Evaluation Capability in Language Models** - 10 | [Github](https://github.com/kaistAI/Prometheus) | [Data](https://huggingface.co/datasets/kaist-ai/Feedback-Collection)|
- **TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks** - 10 | [Github](https://tiger-ai-lab.github.io/TIGERScore/) | [Data](https://huggingface.co/datasets/TIGER-Lab/MetricInstruct)|
- **Generative Judge for Evaluating Alignment** - 10 | [Github](https://github.com/GAIR-NLP/auto-j) | [Data](https://github.com/GAIR-NLP/auto-j)
- **INSTRUCTSCORE: Explainable Text Generation Evaluation with Fine-grained Feedback** - 05 | [Github](https://github.com/xu1998hz/InstructScore_SEScore3) | [Data](https://github.com/xu1998hz/InstructScore_SEScore3)
- **Instruction Fusion: Advancing Prompt Evolution through Hybridization** - 12 |
- **MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning** - 11 | [Github](https://github.com/codefuse-ai/MFTCOder)| [Data](https://huggingface.co/datasets/codefuse-ai/Evol-instruction-66k) <br> [Data](https://huggingface.co/datasets/codefuse-ai/CodeExercise-Python-27k)|
- **Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation** - 10 | [Github](https://github.com/SalesforceAIResearch/PersDistill)|
- **LLM-Assisted Code Cleaning For Training Accurate Code Generators** - 11
- **Code Llama: Open Foundation Models for Code** - 08 | [Github](https://github.com/facebookresearch/codellama)|
- **Distilled GPT for Source Code Summarization** - 08 | [Github](https://github.com/apcl-research/jam-cgpt) | [Data](https://huggingface.co/datasets/apcl/Jam-CGPT/tree/main)|
- **Textbooks Are All You Need** - 06 |
- **Mixed Distillation Helps Smaller Language Model Better Reasoning** - 12 |
- **ChatGPT outperforms crowd workers for text-annotation tasks** - 03 |
- **PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization** - 06 | [Github](https://github.com/WeOpenML/PandaLM)| [Data](https://github.com/WeOpenML/PandaLM)|
- **InPars: Data Augmentation for Information Retrieval using Large Language Models** - 02 | [Github](https://github.com/zetaalphavector/inpars)| [Data](https://github.com/zetaalphavector/inpars)|
- **Annollm: Making large language models to be better crowdsourced annotators** - 03 |
Context Following
- **LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions** - 04 | [Github](https://github.com/mbzuai-nlp/LaMini-LM?tab=readme-ov-file) | [Data](https://huggingface.co/datasets/MBZUAI/LaMini-instruction)|
- **Wizardlm: Empowering large language models to follow complex instructions** - 04 | [Github](https://github.com/nlpxucan/WizardLM)| [Data](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k) <br> [Data](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)|
- **Alpaca: Aligning Language Model with Human Preferences** - 2023-03 | [Github](https://github.com/tatsu-lab/stanford_alpaca)| [Data](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json)|
- **Enhancing Chat Language Models by Scaling High-quality Instructional Conversations** - 05 | [Github](https://github.com/thunlp/UltraChat) | [Data](https://huggingface.co/datasets/stingning/ultrachat)|
- **Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data** - 04 | [Github](https://github.com/project-baize/baize-chatbot)| [Data](https://github.com/project-baize/baize-chatbot/tree/main/data)|
- **Revisiting Knowledge Distillation for Autoregressive Language Models** - 02 |
- **What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning** - 12 | [Github](https://github.com/hkust-nlp/deita) | [Data](https://github.com/hkust-nlp/deita)|
- **MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following** - 12 | [Github](https://github.com/RenzeLou/Muffin) | [Data](https://huggingface.co/datasets/Reza8848/MUFFIN_68k)|
- **ExpertPrompting: Instructing Large Language Models to be Distinguished Experts** - 05 | [Github](https://github.com/OFA-Sys/ExpertLLaMA) | [Data](https://github.com/OFA-Sys/ExpertLLaMA)|
- **Koala: A Dialogue Model for Academic Research** - 2023-04 | [Github](https://github.com/lm-sys/FastChat)| [Data](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations)|
- **Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection** - 10 | [Github](https://selfrag.github.io/) | [Data](https://selfrag.github.io/)|
- **SAIL: Search-Augmented Instruction Learning** - 05 | [Github](https://openlsr.org/sail-7b) | [Data](https://github.com/luohongyin/SAIL#reproducing-sail-models)|
- **Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks** - 05 | [Github](https://github.com/Nardien/KARD) | [Data](https://github.com/Nardien/KARD)|
- **Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality** - 2023-03 | [Github](https://github.com/lm-sys/FastChat)| [Data](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations)|
- **Self-instruct: Aligning language model with self generated instructions** - 12 | [Github](https://github.com/yizhongw/self-instruct)| [Data](https://github.com/yizhongw/self-instruct) |
- **Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models** - 02 |
- **Phi-2: The surprising power of small language models** - 2023-12 |
- **Textbooks Are All You Need II: Phi-1.5 Technical Report** - 09 |
Alignment
- **Aligning Large and Small Language Models via Chain-of-Thought Reasoning** - 03 | [Github](https://github.com/lranaldii/Aligning_LLMs) |
- **Divide-or-Conquer? Which Part Should You Distill Your LLM?** - 02 |
- **Ultrafeedback: Boosting language models with high-quality feedback** - 10 | [Github](https://github.com/thunlp/UltraFeedback) | [Data](https://huggingface.co/datasets/openbmb/UltraFeedback)|
- **Reward Design with Language Models** - 03 | [Github](https://github.com/minaek/reward_design_with_llms)|
- **SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation** - 05
- **Rlaif: Scaling Reinforcement Learning from Human Feedback with AI Feedback** - 09 |
- **Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements** - 02 | [Github](https://github.com/tianyi-lab/DEBATunE) | [Data](https://github.com/tianyi-lab/DEBATunE)|
- **Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering** - 11 | [Github](https://github.com/zjukg/KnowPAT) |
- **Orca 2: Teaching Small Language Models How to Reason** - 11 |
- **Zephyr: Direct Distillation of LM Alignment** - 10 | [Github](https://github.com/huggingface/alignment-handbook) | [Data](https://github.com/huggingface/alignment-handbook)|
- **RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment** - 07 | [Github](https://github.com/facebookresearch/rlcd)|
- **Aligning Large Language Models through Synthetic Feedbacks** - 05 | [Github](https://github.com/naver-ai/almost)|[Data](https://github.com/naver-ai/almost)|
- **Training Language Models with Language Feedback at Scale** - 03 |
- **Constitutional AI: Harmlessness from AI Feedback** - 12 |
- **Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision** - 05 | [Github](https://github.com/IBM/Dromedary) | [Data](https://huggingface.co/datasets/zhiqings/dromedary-65b-verbose-clone-v0)|
- **Training Socially Aligned Language Models on Simulated Social Interactions** - 05 |
- **Orca: Progressive Learning from Complex Explanation Traces of GPT-4** - 06 |
- **Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning** - 02 | [Github](https://github.com/tianyi-lab/Reflection_Tuning) | [Data](https://github.com/tianyi-lab/Reflection_Tuning)|
- **Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning** - 10 | [Github](https://github.com/tianyi-lab/Reflection_Tuning) | [Data](https://github.com/tianyi-lab/Reflection_Tuning)|
- **OpenChat: Advancing Open-source Language Models with Mixed-Quality Data** - 09 | [Github](https://github.com/imoneoi/openchat) | [Data](https://github.com/imoneoi/openchat)|
Agent
- **Toolformer: Language Models Can Teach Themselves to Use Tools** - 02 |
- **Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT** - 04 | [Github](https://github.com/jwzhanggy/Graph_Toolformer) | [Data](https://github.com/jwzhanggy/Graph_Toolformer)|
- **Gorilla: Large Language Model Connected with Massive APIs** - 05 | [Github](https://gorilla.cs.berkeley.edu/) | [Data](https://gorilla.cs.berkeley.edu/)|
- **GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction** - 05 | [Github](https://github.com/AILab-CVC/GPT4Tools) | [Data](https://github.com/AILab-CVC/GPT4Tools)|
- **ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases** - 06 | [Github](https://github.com/tangqiaoyu/ToolAlpaca) | [Data](https://github.com/tangqiaoyu/ToolAlpaca)|
- **ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs** - 07 | [Github](https://github.com/OpenBMB/ToolBench) | [Data](https://github.com/OpenBMB/ToolBench)|
- **Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum** - 08 | [Github](https://github.com/shizhl/Confucius) |
- **CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets** - 09 | [Github](https://github.com/lifan-yuan/CRAFT) |
- **MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning** - 01 | [Github](https://github.com/MLLM-Tool/MLLM-Tool) | [Data](https://github.com/MLLM-Tool/MLLM-Tool)|
- **Small LLMs Are Weak Tool Learners: A Multi-LLM Agent** - 01 |[Github](https://github.com/X-PLUG/Multi-LLM-Agent) |
- **EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction** - 01 |[Github](https://github.com/microsoft/JARVIS/) |
- **AUTOACT: Automatic Agent Learning from Scratch via Self-Planning** - 01 | [Github](https://github.com/zjunlp/AutoAct)
- **Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs** - 11 | [Github](https://allenai.github.io/lumos/) | [Data](https://allenai.github.io/lumos/)|
- **TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems** - 11 |
- **Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld** - 11 |
- **FireAct: Toward Language Agent Fine-tuning** - 10 | [Github](https://fireact-agent.github.io/) | [Data](https://fireact-agent.github.io/)|
- **AgentTuning: Enabling Generalized Agent Abilities for LLMs** - 10 | [Github](https://github.com/THUDM/AgentTuning) |
- **Eureka: Human-Level Reward Design via Coding Large Language Models** - 10 | [Github](https://github.com/eureka-research/Eureka)
- **Language Instructed Reinforcement Learning for Human-AI Coordination** - 04 |
- **Guiding Pretraining in Reinforcement Learning with Large Language Models** - 02 |
- **Distilling Internet-Scale Vision-Language Models into Embodied Agents** - 01 |
- **Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models** - 11 |
- **Motif: Intrinsic Motivation from Artificial Intelligence Feedback** - 10 | [Github](https://github.com/facebookresearch/motif) |
Multi-Modality
- **Miko: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery** - 02 |
- **Localizing Visual Commonsense Knowledge in Large Language Models** - 12 | [Github](https://github.com/jamespark3922/localized-skd) | [Data](https://github.com/jamespark3922/localized-skd?tab=readme-ov-file) |
- **To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning** - 11 | [Github](https://github.com/X2FD/LVIS-INSTRUCT4V ) | [Data](https://github.com/X2FD/LVIS-INSTRUCT4V) |
- **ILuvUI: Instruction-tuned LangUage-Vision modeling of UIs from Machine Conversations** - 10 |
- **NExT-GPT: Any-to-Any Multimodal LLM** - 09 | [Github](https://github.com/NExT-GPT/NExT-GPT) | [Data](https://github.com/NExT-GPT/NExT-GPT)|
- **SVIT: Scaling up Visual Instruction Tuning** - 07 | [Github](https://github.com/BAAI-DCAI/Visual-Instruction-Tuning) | [Data](https://huggingface.co/datasets/BAAI/SVIT)|
- **StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data** - 08 | [Github](https://github.com/icoz69/StableLLAVA?tab=readme-ov-file) | [Data](https://github.com/icoz69/StableLLAVA?tab=readme-ov-file)|
- **PointLLM: Empowering Large Language Models to Understand Point Clouds** - 08 | [Github](https://github.com/OpenRobotLab/PointLLM) | [Data](https://huggingface.co/datasets/RunsenXu/PointLLM/tree/main)|
- **ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning** - 07 |
- **Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic** - 06 | [Github](https://github.com/shikras/shikra) | [Data](https://github.com/shikras/shikra/blob/main/docs/data.md)
- **Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning** - 06 | [Github](https://github.com/FuxiaoLiu/LRV-Instruction) | [Data](https://github.com/FuxiaoLiu/LRV-Instruction?tab=readme-ov-file) |
- **Valley: Video Assistant with Large Language model Enhanced abilitY** - 06 | [Github](https://github.com/RupertLuo/Valley) | [Data](https://huggingface.co/datasets/luoruipu1/Valley-Instruct-73k)|
- **DetGPT: Detect What You Need via Reasoning** - 05 | [Github](https://detgpt.github.io) |
KD Algorithms
Distillation Algorithms
- **LLM-QAT: Data-Free Quantization Aware Training for Large Language Models** - 05 | [Github](https://github.com/facebookresearch/LLM-QAT)| [Data](https://github.com/facebookresearch/LLM-QAT)|
- **Less is more: Task-aware layer-wise distillation for language model compression** - 10 | [Github](https://github.com/cliang1453/task-aware-distillation)
- **Knowledge Distillation for Closed-Source Language Models** - 01 |
- **Knowledge Fusion of Large Language Models** - 01 | [Github](https://github.com/fanqiwan/FuseLLM )
- **Improving In-context Learning via Bidirectional Alignment** - 12
- **BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation** - 02 | [Github](https://github.com/DD-DuDa/BitDistiller) |
- **Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering** - 03 |
- **Large Language Models Can Self-Improve** - 10
- **PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning** - 02 | [Github](https://github.com/gmkim-ai/PromptKD) | [Data](https://github.com/gmkim-ai/PromptKD/tree/main/data_utils)
- **Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models** - 04 |
- **Weight-Inherited Distillation for Task-Agnostic BERT Compression** - 03 | [Github](https://github.com/wutaiqiang/WID-NAACL2024) |
- **DISTILLM: Towards Streamlined Distillation for Large Language Models** - 02 | [Github](https://github.com/jongwooko/distillm) |
- **Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs** - 02 | [Github](https://github.com/Nicolas-BZRD/llm-recipes) | [Data](https://huggingface.co/Nicolas-BZRD)|
- **Baby Llama: Knowledge Distillation from an Ensemble of Teachers Trained on a Small Dataset with No Performance Penalty** - 08 | [Github](https://github.com/timinar/BabyLlama) | [Data](https://github.com/timinar/BabyLlama )|
- **f-Divergence Minimization for Sequence-Level Knowledge Distillation** - 07 | [Github](https://github.com/MANGA-UOFA/fdistill) | [Data](https://drive.google.com/file/d/1V7bPndyoTQxcJ6m1BoXAw7-ub-jv8Wh1/view?usp=sharing)|
- **MiniLLM: Knowledge Distillation of Large Language Models** - 06 | [Github](https://github.com/microsoft/LMOps/tree/main/minillm) | [Data](https://github.com/microsoft/LMOps/tree/main/minillm) |
- **DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter** - 10
- **Direct Language Model Alignment from Online AI Feedback** - 02 |
- **Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint** - 01 | [Github](https://github.com/RUCAIBox/RLMEC)
- **Aligning Large Language Models through Synthetic Feedback** - 05 | [Github](https://github.com/naver-ai/almost)| [Data](https://github.com/naver-ai/almost )|
- **Language Model Self-improvement by Reinforcement Learning Contemplation** - 05
- **KnowTuning: Knowledge-aware Fine-tuning for Large Language Models** - 02 | [Github](https://github.com/youganglyu/KnowTuning) |
- **Zephyr: Direct Distillation of Language Model Alignment** - 10 | [Github](https://github.com/huggingface/alignment-handbook ) | [Data](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)|
- **Self-Rewarding Language Models** - 01 | [Github](https://github.com/lucidrains/self-rewarding-lm-pytorch?tab=readme-ov-file )|
- **Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models** - 01 | [Github](https://github.com/uclaml/SPIN) | [Data](https://huggingface.co/datasets/UCLA-AGI/SPIN_iter0)|
- **Constitutional AI: Harmlessness from AI Feedback** - 12 |
- **CycleAlign: Iterative Distillation from Black-box LLM to White-box Models for Better Human Alignment** - 10
- **STaR: Bootstrapping Reasoning With Reasoning** - 03 | [Github](https://github.com/ezelikman/STaR)|
- **Towards the Fundamental Limits of Knowledge Transfer over Finite Domains** - 10 |
Knowledge Elicitation
- **Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes** - 05 | [Github](https://github.com/google-research/distilling-step-by-step)| [Data](https://github.com/google-research/distilling-step-by-step)|
- **Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step** - 06 |
- **GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo** - 2023-03 | [Github](https://github.com/nomic-ai/gpt4all)|
- **Specializing Smaller Language Models towards Multi-Step Reasoning** - 01 |
- **Large Language Models Are Reasoning Teachers** - 12 | [Github](https://github.com/itsnamgyu/reasoning-teacher)| [Data](https://github.com/itsnamgyu/reasoning-teacher)|
- **Teaching Small Language Models to Reason** - 12 |
- **Explanations from Large Language Models Make Small Reasoners Better** - 10 |
- **V-STaR: Training Verifiers for Self-Taught Reasoners** - 02
- **Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation** - 01 | [Github](https://github.com/Zheng0428/COIG-Kun) | [Data](https://huggingface.co/datasets/m-a-p/COIG-Kun)|
- **An Empirical Study of Instruction-tuning Large Language Models in Chinese** - 10 | [Github](https://github.com/PhoebusSi/Alpaca-CoT)| [Data](https://huggingface.co/datasets/QingyiSi/Alpaca-CoT)|
- **WizardCoder: Empowering Code Large Language Models with Evol-Instruct** - 06 | [Github](https://github.com/nlpxucan/WizardLM) |
- **Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases** - 03 | [Github](https://github.com/LianjiaTech/BELLE) | [Data](https://huggingface.co/BelleGroup)|
- **Symbolic Knowledge Distillation: from General Language Models to Commonsense Models** - 10 | [Github](https://github.com/peterwestai2/symbolic-knowledge-distillation) | [Data](https://github.com/peterwestai2/symbolic-knowledge-distillation)|
- **Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs** - 03 |
- **DB-LLM: Accurate Dual-Binarization for Efficient LLMs** - 02 |
- **Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment** - 11 |
- **Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization** - 10 | [Github](https://github.com/swarnaHub/ExplanationIntervention) |
- **Language to Rewards for Robotic Skill Synthesis** - 06 | [Github](https://github.com/google-deepmind/language_to_reward_2023)|
- **Lion: Adversarial Distillation of Closed-Source Large Language Model** - 05 | [Github](https://github.com/YJiangcm/Lion)|
- **APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference** - 01 |
- **GRATH: Gradual Self-Truthifying for Large Language Models** - 01 |
- **Beyond human data: Scaling self-training for problem-solving with language models** - 12
- **Self-Knowledge Guided Retrieval Augmentation for Large Language Models** - 10 | [Github](https://github.com/THUNLP-MT/SKR) |
- **RAIN: Your Language Models Can Align Themselves without Finetuning** - 09 | [Github](https://github.com/SafeAILab/RAIN)
- **Reinforced Self-Training (ReST) for Language Modeling** - 08
- **Humback: Self-Alignment with Instruction Backtranslation** - 08 | [Github](https://github.com/Spico197/Humback)
- **Self-Alignment of Large Language Models via Reinforcement Learning from Contrast Distillation** - 07 | [Github](https://github.com/facebookresearch/rlcd)|
- **Self-Improvement of Large Language Models via Reinforcement Learning from Human Feedback** - 06 |
Star History
Law
- **ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases** - 06 | [Github](https://github.com/PKU-YuanGroup/ChatLaw) |
- **Lawyer LLaMA Technical Report** - 05 | [Github](https://github.com/AndrewZhe/lawyer-llama) | [Data](https://github.com/AndrewZhe/lawyer-llama)|
Medical & Healthcare
- **HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs** - 11 | [Github](https://github.com/FreedomIntelligence/HuatuoGPT-II) | [Data](https://huggingface.co/datasets/FreedomIntelligence/HuatuoGPT2_sft_instruct_GPT4_50K)|
- **AlpaCare: Instruction-tuned large language models for medical application** - 10 | [Github](https://github.com/xzhang97666/alpacare) | [Data](https://github.com/XZhang97666/AlpaCare/blob/master/data/MedInstruct-52k.json)|
- **DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation** - 08 | [Github](https://github.com/FudanDISC/DISC-MedLLM/tree/main) | [Data](https://huggingface.co/datasets/Flmc/DISC-Med-SFT)|
- **HuatuoGPT: Taming Language Model to Be a Doctor** - 05 | [Github](https://github.com/FreedomIntelligence/HuatuoGPT) | [Data](https://huggingface.co/datasets/FreedomIntelligence/HuatuoGPT-sft-data-v1)|
- **DoctorGLM: Fine-tuning your Chinese doctor is not a herculean task** - 04 | [Github](https://github.com/xionghonglin/DoctorGLM) | [Data](https://github.com/Toyhom/Chinese-medical-dialogue-data)|
- **Huatuo: Tuning LLM with Chinese Medical Knowledge** - 04 | [Github](https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese) |
- **MedAlpaca: An Open-Source Collection of Medical Conversational AI Models and Training Data** - 04 | [Github](https://github.com/kbressem/medAlpaca) | [Data](https://github.com/kbressem/medAlpaca)
- **PMC-LLaMA: Further Finetuning LLaMA on Medical Papers** - 04 | [Github](https://github.com/chaoyi-wu/PMC-LLaMA) | [Data](https://huggingface.co/datasets/axiong/pmc_llama_instructions)|
- **ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge** - 03 | [Github](https://github.com/Kent0n-Li/ChatDoctor) |
Finance
Science
- **MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining** - 03 |
- **SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning** - 01 | [Github](https://github.com/THUDM/SciGLM) |
- **AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets** - 01
- **GeoGalactica: A Scientific Large Language Model in Geoscience** - 01 | [Github](https://github.com/geobrain-ai/geogalactica) | [Data](https://huggingface.co/datasets/daven3/geobench)
- **InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery** - 11 | [Github](https://github.com/IDEA-XL/InstructMol) |
- **LLM-Prop: Predicting Physical And Electronic Properties Of Crystalline Solids From Their Text Descriptions** - 10 | [Github](https://github.com/vertaix/LLM-Prop) |
- **OceanGPT: A Large Language Model for Ocean Science Tasks** - 10 | [Github](https://github.com/zjunlp/KnowLM) | [Data](https://huggingface.co/datasets/zjunlp/OceanBench)|
- **MarineGPT: Unlocking Secrets of Ocean to the Public** - 10 | [Github](https://github.com/hkust-vgd/MarineGPT)
- **ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving** - 09 | [Github](https://github.com/microsoft/ToRA)
- **DARWIN Series: Domain Specific Large Language Models for Natural Science** - 08 | [Github](https://github.com/MasterAI-EAM/Darwin) |
- **Biomedgpt: Open Multimodal Generative Pre-trained Transformer for Biomedicine** - 08 | [Github](https://github.com/PharMolix/OpenBioMed) | [Data](https://github.com/PharMolix/OpenBioMed)|
- **Prot2Text: Multimodal Protein’s Function Generation with GNNs and Transformers** - 07 |
- **xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein** - 07 |
- **GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning** - 06 | [Github](https://github.com/zhao-ht/GIMLET) | [Data](https://huggingface.co/datasets/haitengzhao/molecule_property_instruction)|
- **K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization** - 06 | [Github](https://github.com/davendw49/k2)
- **Mammoth: Building Math Generalist Models through Hybrid Instruction Tuning** - 09 | [Github](https://tiger-ai-lab.github.io/MAmmoTH/)| [Data](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)|
- **Wizardmath: Empowering mathematical reasoning for large language models via reinforced evol-instruct** - 08 | [Github](https://github.com/nlpxucan/WizardLM)|
- **Visual Instruction Tuning** - 04 | [Github](https://github.com/haotian-liu/LLaVA) | [Data](https://github.com/haotian-liu/LLaVA/blob/main/docs/Data.md)|
Misc.
- **OWL: A Large Language Model for IT Operations** - 09 | [Github](https://github.com/HC-Guo/Owl) | [Data](https://github.com/HC-Guo/Owl/tree/main/OWL-Instruct/data)|
- **EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education** - 08 | [Github](https://github.com/ECNU-ICALK/EduChat) | [Data](https://huggingface.co/datasets/ecnu-icalk/educhat-sft-002-data-osm) |
Encoder-based KD