An Awesome List to LLM Instruction Following
https://github.com/thinkwee/awesome-llm-if

Strong **IF (Instruction Following)** capabilities are the foundation for building complex LLM-based applications, such as [Tool Usage](https://github.com/thunlp/ToolLearningPapers) or [Multi-Agent Systems](https://thinkwee.top/multiagent_ebook/). This repository aims to provide a comprehensive list of papers, repositories, and other resources on improving, evaluating, benchmarking, and theoretically analyzing instruction-following capabilities, to advance research in this field.

The repository is still under active construction, and we welcome everyone to collaborate and contribute!

# Method
- [DO LLMS “KNOW” INTERNALLY WHEN THEY FOLLOW INSTRUCTIONS?](https://arxiv.org/pdf/2410.14516)
  - Cambridge, Apple
  - In submission to ICLR 2025
- [SELF-PLAY WITH EXECUTION FEEDBACK: IMPROVING INSTRUCTION-FOLLOWING CAPABILITIES OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2406.13542)
  - Alibaba
  - [AutoIF](https://github.com/QwenLM/AutoIF) ![](https://img.shields.io/github/stars/QwenLM/AutoIF.svg)
- [LESS: Selecting Influential Data for Targeted Instruction Tuning](https://arxiv.org/pdf/2402.04333)
  - Princeton University, University of Washington
  - ICML 2024
  - [LESS](https://github.com/princeton-nlp/less) ![](https://img.shields.io/github/stars/princeton-nlp/less.svg)
- [WizardLM: Empowering Large Language Models to Follow Complex Instructions](https://arxiv.org/pdf/2304.12244)
  - Microsoft, Peking University
  - ICLR 2024
  - [WizardLM](https://github.com/nlpxucan/WizardLM) ![](https://img.shields.io/github/stars/nlpxucan/WizardLM.svg)
- [Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models](https://arxiv.org/pdf/2402.11532)
  - University of Minnesota, Amazon AGI, Grammarly
- [Instruction Pre-Training: Language Models are Supervised Multitask Learners](https://arxiv.org/pdf/2406.14491)
  - Microsoft Research, Tsinghua University
  - [LMOps](https://github.com/microsoft/LMOps) ![](https://img.shields.io/github/stars/microsoft/LMOps.svg)

# Evaluation
- [Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators](https://arxiv.org/pdf/2404.04475)
  - Stanford University, Independent Researcher
  - [alpaca_eval](https://github.com/tatsu-lab/alpaca_eval) ![](https://img.shields.io/github/stars/tatsu-lab/alpaca_eval.svg)
- [INFOBENCH: Evaluating Instruction Following Ability in Large Language Models](https://arxiv.org/pdf/2401.03601)
  - Tencent AI Lab, Seattle; University of Central Florida; Emory University; University of Georgia; Shanghai Jiao Tong University
  - [InfoBench](https://github.com/qinyiwei/InfoBench) ![](https://img.shields.io/github/stars/qinyiwei/InfoBench.svg)
- [STRUC-BENCH: Are Large Language Models Good at Generating Complex Structured Tabular Data?](https://aclanthology.org/2024.naacl-short.2.pdf)
  - Yale University, Zhejiang University, New York University
  - NAACL 2024
  - [Struc-Bench](https://github.com/gersteinlab/Struc-Bench) ![](https://img.shields.io/github/stars/gersteinlab/Struc-Bench.svg)
- [FOFO: A Benchmark to Evaluate LLMs’ Format-Following Capability](https://arxiv.org/pdf/2402.18667)
  - Salesforce Research, University of Illinois at Chicago, Pennsylvania State University
  - [FoFo](https://github.com/SalesforceAIResearch/FoFo) ![](https://img.shields.io/github/stars/SalesforceAIResearch/FoFo.svg)
- [AlignBench: Benchmarking Chinese Alignment of Large Language Models](https://arxiv.org/pdf/2311.18743)
  - Tsinghua University, Zhipu AI, Renmin University of China, Sichuan University, Lehigh University
  - [AlignBench](https://github.com/THUDM/AlignBench) ![](https://img.shields.io/github/stars/THUDM/AlignBench.svg)
- [Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena](https://proceedings.neurips.cc/paper_files/paper/2023/file/91f18a1287b398d378ef22505bf41832-Paper-Datasets_and_Benchmarks.pdf)
  - UC Berkeley, UC San Diego, Carnegie Mellon University, Stanford, MBZUAI
  - NeurIPS 2023
  - [llm_judge](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge) ![](https://img.shields.io/github/stars/lm-sys/FastChat.svg)
- [Benchmarking Complex Instruction-Following with Multiple Constraints Composition](https://arxiv.org/pdf/2407.03978)
  - Tsinghua, Zhipu, China University of Geosciences, Central China Normal University
  - [ComplexBench](https://github.com/thu-coai/ComplexBench) ![](https://img.shields.io/github/stars/thu-coai/ComplexBench.svg)
- [EVALUATING LARGE LANGUAGE MODELS AT EVALUATING INSTRUCTION FOLLOWING](https://arxiv.org/pdf/2310.07641)
  - Tsinghua, Princeton, UIUC
  - ICLR 2024
  - [LLMBar](https://github.com/princeton-nlp/LLMBar) ![](https://img.shields.io/github/stars/princeton-nlp/LLMBar.svg)
- [Instruction-Following Evaluation for Large Language Models](https://arxiv.org/pdf/2311.07911)
  - Google, Yale
  - [instruction_following_eval](https://github.com/google-research/google-research/tree/master/instruction_following_eval) ![](https://img.shields.io/github/stars/google-research/google-research.svg)
- [FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models](https://arxiv.org/pdf/2311.09829)
  - Lenovo, TJU
- [Can Large Language Models Understand Real-World Complex Instructions?](https://arxiv.org/pdf/2309.09150)
  - Fudan, ECNU
  - AAAI 2024
  - [CELLO](https://github.com/Abbey4799/CELLO) ![](https://img.shields.io/github/stars/Abbey4799/CELLO.svg)
- [FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models](https://arxiv.org/pdf/2310.20410)
  - HKUST, Huawei
  - ACL 2024
  - [FollowBench](https://github.com/YJiangcm/FollowBench) ![](https://img.shields.io/github/stars/YJiangcm/FollowBench.svg)
- [Evaluating Large Language Models on Controlled Generation Tasks](https://arxiv.org/pdf/2310.14542)
  - USC, UC, ETH, Amazon, DeepMind
  - [llm-controlgen](https://github.com/sunjiao123sun/llm-controlgen) ![](https://img.shields.io/github/stars/sunjiao123sun/llm-controlgen.svg)
