https://github.com/RenzeLou/awesome-instruction-learning

Papers and Datasets on Instruction Tuning and Following. ✨✨✨
awesome-list datasets in-context-learning instruction instruction-learning instruction-tuning large-language-models paper-list pretrained-language-model prompt survey
Host: GitHub
URL: https://github.com/RenzeLou/awesome-instruction-learning
Owner: RenzeLou
License: mit
Created: 2023-02-21T01:43:05.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-04-04T19:48:35.000Z (about 1 year ago)
Last Synced: 2025-05-08T03:01:46.582Z (21 days ago)
Topics: awesome-list, datasets, in-context-learning, instruction, instruction-learning, instruction-tuning, large-language-models, paper-list, pretrained-language-model, prompt, survey
Language: Python
Homepage: https://arxiv.org/abs/2303.10475
Size: 6.25 MB
Stars: 493
Watchers: 7
Forks: 24
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

        
# Awesome Instruction Learning

🔥🔥🔥 An awesome reading list of Instruction Tuning and Following, including papers and datasets. 





 👉 Explore our latest survey update! Feel free to dive in and discover the improvements we've made 👀 🤗 :  Latest Survey  



---

## ❤️ Contribution

This repository is currently maintained by [Renze Lou](https://renzelou.github.io/) @ PennState and [Kai Zhang](https://drogozhang.github.io/) @ OhioState. **We appreciate any contributions** ❤️.

If you have any suggestions or find any missed papers, feel free to [reach out](https://outlook.office.com/mail/deeplink/compose?mailtouri=mailto%3Amarionojump0722%40gmail.com) or submit a [pull request](https://github.com/RenzeLou/awesome-instruction-learning/pulls):

1. Use the following markdown format.

```markdown

**Paper Title.** *Author 1, Author 2, and Author 3.* Conference/Journal/Preprint Year. [[pdf](link)]; [[other resources](link)].

```

2. If a preprint has multiple versions, please use **the year of the earliest submission**.

   

3. List the papers in **descending order by year** (latest first).
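
The rules above can be sketched as a small helper. This is only an illustration: `join_authors`, `format_entry`, and `sort_entries` are hypothetical names that are not part of this repository, and the papers below are placeholders rather than real entries.

```python
def join_authors(authors):
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if len(authors) <= 2:
        return " and ".join(authors)
    return ", ".join(authors[:-1]) + ", and " + authors[-1]


def format_entry(title, authors, venue, year, pdf_url):
    """Render one paper in the markdown format requested in rule 1."""
    return f"**{title}.** *{join_authors(authors)}.* {venue} {year}. [[pdf]({pdf_url})]."


def sort_entries(entries):
    """Order entries by year, latest first (rule 3)."""
    return sorted(entries, key=lambda entry: entry["year"], reverse=True)


# Placeholder entries; for a preprint, `year` is the earliest submitted year (rule 2).
papers = [
    {"title": "Paper A", "authors": ["Author 1", "Author 2"],
     "venue": "Preprint", "year": 2021, "pdf_url": "link"},
    {"title": "Paper B", "authors": ["Author 1", "Author 2", "Author 3"],
     "venue": "Preprint", "year": 2023, "pdf_url": "link"},
]

for paper in sort_entries(papers):
    print(format_entry(**paper))
```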

## 🥳 Citation

Find this repository helpful? 😊😊😊  

Please consider citing our paper. 👇👇👇

```
@article{lou2023instruction,
  title={A Comprehensive Survey on Instruction Following},
  author={Lou, Renze and Zhang, Kai and Yin, Wenpeng},
  journal={arXiv preprint arXiv:2303.10475},
  year={2023}
}
```

---

## 🔍 Table of Contents 

- [1. 💁🏽‍♀️ Introduction](#1-️-introduction)

- [2. 🎓 Surveys and Tutorials](#2--surveys-and-tutorials)

- [3. 📚 Corpora](#3--corpora)

- [4. 🗂️ Taxonomy](#4-️-taxonomy)

  - [4.1 Entailment-oriented Instruction](#41-entailment-oriented-instruction)

  - [4.2 PLM-oriented Instruction](#42-plm-oriented-instruction)

  - [4.3 Human-oriented Instruction](#43-human-oriented-instruction)

- [5. 📊 Analyses](#5--analyses)

  - [5.1 Scale](#51-scale)

  - [5.2 Explanability](#52-explanability)

  - [5.3 Robustness and Safety](#53-robustness-and-safety)

  - [5.4 Evaluation](#54-evaluation)

  - [5.5 Negation](#55-negation)

  - [5.6 Complexity](#56-complexity)

  - [5.7 Other Papers](#57-other-papers)

- [6. 🤖 Applications](#6--applications)

  - [6.1 Human-Computer Interaction](#61-human-computer-interaction)

  - [6.2 Data and Feature Augmentation](#62-data-and-feature-augmentation)

  - [6.3 General-purpose Language Models](#63-general-purpose-language-models)

  - [6.4 Other Papers](#64-other-papers)

- [7. 📖 Extended Reading](#7--extended-reading)

  - [7.1 Instruction Induction](#71-instruction-induction)

  - [7.2 ChatGPT-related Papers](#72-chatgpt-related-papers)

  - [7.3 Human Feedback vs. Model Feedback](#73-human-feedback-vs-model-feedback)

  - [7.4 Scalable Oversight and Alignment](#74-scalable-oversight-and-alignment)

  - [7.5 Other Papers](#75-other-papers)

---

## 1. 💁🏽‍♀️ Introduction







Why *instruction-driven* learning instead of *example-driven* learning?

- 👉 **Affordable.** In conventional example-driven supervised learning, each *downstream* task usually requires extensive labeled examples 💰. In instruction learning, each *downstream* task may require only one instruction and just a few examples 🤩.

- 👉 **One model, all tasks.** An ideal AI system should be able to quickly understand and handle various new tasks 💫.

- 👉 **A promising research direction.** Traditional example-driven supervised learning uses labeled instances to represent the task semantics, i.e., models are trained on numerous examples to recover the original meaning of the task. So **why not directly use the task instruction**, **which already carries the essential task semantics**?

## 2. 🎓 Surveys and Tutorials

We use the label ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A) to denote papers with a more comprehensive perspective. Other papers focus on a specific kind of in-context instruction: ![prompt](https://img.shields.io/badge/prompt-90EE90), few-shot ![in-context demonstrations](https://img.shields.io/badge/demonstrations-FFB6C1), or CoT ![reasoning](https://img.shields.io/badge/reasoning-9cf).

1. **A Comprehensive Survey on Instruction Following.** *Renze Lou, Kai Zhang, and Wenpeng Yin.* Preprint 2023. [[pdf](https://arxiv.org/abs/2303.10475)]; [[paper list](https://github.com/RenzeLou/awesome-instruction-learning)]. ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A)

   

2. **Learning from Task Instructions.** *Wenpeng Yin, Qinyuan Ye, Pengfei Liu, Xiang Ren, and Hinrich Schütze.* EMNLP Tutorial 2023. [[pdf](https://aclanthology.org/2023.emnlp-tutorial.4.pdf)]. ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A)

   

3. **Natural Language Reasoning, A Survey.** *Fei Yu, Hongbo Zhang, and Benyou Wang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.14725.pdf)]; [[paper list](https://github.com/FreedomIntelligence/ReasoningNLP)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)

4. **Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing.** *Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig.* ACM Computing Surveys 2023. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3560815)]; [[website](http://pretrain.nlpedia.ai/)]. ![prompt](https://img.shields.io/badge/prompt-90EE90)

   

5. **A Survey on In-context Learning**. *Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, and Zhifang Sui*. Preprint 2022. [[pdf](https://arxiv.org/pdf/2301.00234.pdf)]. ![in-context demonstrations](https://img.shields.io/badge/demonstrations-FFB6C1)

   

6. **Towards Reasoning in Large Language Models: A Survey.** *Jie Huang, and Kevin Chen-Chuan Chang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10403.pdf)]; [[paper list](https://github.com/jeffhj/LM-reasoning)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)

7. **Reasoning with Language Model Prompting: A Survey.** *Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, and Huajun Chen.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.09597.pdf)]; [[paper list](https://github.com/zjunlp/Prompt4ReasoningPapers)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)

## 3. 📚 Corpora

**High-quality datasets are the key factor for successful instruction tuning**. Therefore, we place the "corpora" section here to emphasize their importance.

We designed the following table to be easy to reference, and we keep it up to date. We hope it contributes to future research on instruction tuning. 🤗

 *(Some rows come from [Longpre et al.](https://arxiv.org/pdf/2301.13688.pdf), thanks for their great work ❤️.)* 

| Name | Release | Data/Code | Scale: #Tasks | Scale: #Ins. (K) | Language | Annotator |
|---|---|---|---|---|---|---|
| UnifiedQA | 05/2020 | Link | 46 | 750 | | ✍ Human |
| CrossFit | 04/2021 | Link | 159 | 71,000 | | ✍ Human |
| Natural Inst. v1 | 04/2021 | Link | 61 | 620 | | ✍ Human |
| Flan 2021 | 09/2021 | Link | 62 | 4,400 | | ✍ Human |
| P3 | 10/2021 | Link | 62 | 12,000 | | ✍ Human |
| MetaICL | 10/2021 | Link | 142 | 3,500 | | ✍ Human |
| ExMix | 11/2021 | Link | 107 | 500 | | ✍ Human |
| SuperNI (Natural Inst. v2) | 04/2022 | Link | 1,613 | 5,000 | | ✍ Human |
| GLM | 10/2022 | Link | 77 | 12,000 | | ✍ Human |
| Flan 2022 | 10/2022 | Link | 1,836 | 15,000 | | ✍ Human |
| xP3 | 11/2022 | Link | 71 | 81,000 | | ✍ Human |
| Unnatural Inst. | 12/2022 | Link | 117 | 64 | | 🤖 InstructGPT₀₀₂ (`text-davinci-002`) |
| Self-Instruct | 12/2022 | Link | / | 82 | | 🤖 GPT-3 (`davinci`) |
| OPT-IML | 12/2022 | / | 2,207 | 18,000 | | ✍ Human |
| Alpaca | 03/2023 | Link | / | 52 | | 🤖 InstructGPT₀₀₃ (`text-davinci-003`) |
| Baize | 04/2023 | Link | / | 100 | | 🤖 ChatGPT |
| Koala | 04/2023 | / | / | / | | ✍ Human, 🤖 ChatGPT |
| GPT4All | 04/2023 | Link | / | 808 | | ✍ Human, 🤖 ChatGPT |
| Alpaca-gpt4 | 04/2023 | Link | / | 113 | | 🤖 GPT-4 (`gpt-4`) |
| Vicuna | 04/2023 | / | / | 76 | | ✍ Human, 🤖 ChatGPT |
| Dolly | 04/2023 | Link | / | 15 | | ✍ Human |
| Oasst | 04/2023 | Link | / | 84 | | ✍ Human |
| LongForm | 04/2023 | Link | / | 27 | | ✍ Human, 🤖 InstructGPT₀₀₃ (`text-davinci-003`) |
| Symbolic-Instruct | 04/2023 | Link | / | 796 | | ✍ Human, Synthetic Examples |
| LaMini | 04/2023 | Link | / | 2,580 | | 🤖 ChatGPT |
| WizardLM | 04/2023 | Link | / | 196 | | 🤖 ChatGPT |
| COEDIT | 05/2023 | Link | / | 82 | | ✍ Human |
| UltraChat | 05/2023 | Link | / | 1,500 | | 🤖 ChatGPT |
| CoT Collection | 05/2023 | Link | 1,060 | 1,880 | | 🤖 Codex |
| Dynosaur | 05/2023 | Link | 5,740 | 801 | | 🤖 ChatGPT |
| MUFFIN | 10/2023 | Link | / | 68 | | 🤖 ChatGPT, 🤖 GPT-4, ✍ Human |
| Dynamics-of-Instruction | 10/2023 | Link | / | 40 | | ✍ Human |
| CoachLM | 11/2023 | Link | / | 2 | | ✍ Human |
| DEITA | 12/2023 | Link | / | 10 | | 🤖 ChatGPT |
| WaveCoder | 12/2023 | Link | 4 code-related tasks | 20 | | 🤖 ChatGPT, 🤖 GPT-4 |
| Conifer | 04/2024 | Link | / | 13 | | 🤖 GPT-4 |


## 4. 🗂️ Taxonomy

In our paper, we divide the textual instructions into three categories.

### 4.1 Entailment-oriented Instruction

![entailment_oriented](./resources/entailment_oriented.png)

Entailment-oriented instruction treats the task **input** as the **premise** and recasts the task **output** as the **hypothesis**. It unifies conventional classification problems under a textual entailment paradigm.
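
A minimal sketch of this recasting, assuming a hypothetical hypothesis template and a toy word-overlap scorer standing in for a real entailment (NLI) model. None of the names below come from the papers listed in this section.

```python
from dataclasses import dataclass


@dataclass
class EntailmentPair:
    premise: str     # the original task input
    hypothesis: str  # a candidate output verbalized as a sentence
    label: str       # the candidate label the hypothesis stands for


def to_entailment_pairs(text, labels, template="This text is about {label}."):
    """Recast one classification input as one premise-hypothesis pair per label."""
    return [
        EntailmentPair(premise=text, hypothesis=template.format(label=label), label=label)
        for label in labels
    ]


def classify(text, labels, entail_score):
    """Pick the label whose hypothesis the scorer rates as most entailed."""
    pairs = to_entailment_pairs(text, labels)
    best = max(pairs, key=lambda pair: entail_score(pair.premise, pair.hypothesis))
    return best.label


def toy_entail_score(premise, hypothesis):
    """Toy stand-in for an NLI model: counts words shared by both sentences."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return len(premise_words & hypothesis_words)


print(classify("The sports team won the match", ["politics", "sports"], toy_entail_score))
```

In practice `toy_entail_score` would be replaced by the entailment probability of a trained NLI model; the overlap heuristic is here only so the sketch runs without external dependencies.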

1. **A Universal Discriminator for Zero-Shot Generalization.** *Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, and Zhilin Yang.* ACL 2023. [[pdf](https://arxiv.org/pdf/2211.08099.pdf)]; [[code](https://github.com/Rafa-zy/UD)].

   

2. **ConEntail: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining.** *Ranran Haoran Zhang, Aysa Xuemo Fan, and Rui Zhang.* EACL 2023. [[pdf](https://arxiv.org/pdf/2210.07587.pdf)]; [[code](https://github.com/psunlpgroup/ConEntail)].

   

3. **OpenStance: Real-world Zero-shot Stance Detection.** *Hanzi Xu, Slobodan Vucetic, and Wenpeng Yin.* CoNLL 2022. [[pdf](https://arxiv.org/pdf/2210.14299.pdf)]; [[code](https://github.com/xhz0809/OpenStance)].

   

4. **Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference.** *Bangzheng Li, Wenpeng Yin, and Muhao Chen.* TACL 2022. [[pdf](https://aclanthology.org/2022.tacl-1.35.pdf)]; [[code](https://github.com/luka-group/lite)]. 

   

5. **Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning.** *Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, and Eneko Agirre.* Findings of NAACL 2022. [[pdf](https://aclanthology.org/2022.findings-naacl.187.pdf)]; [[code](https://github.com/luka-group/lite)].

6. **Label Verbalization and Entailment for Effective Zero and Few-Shot Relation Extraction.** *Oscar Sainz, Oier Lopez de Lacalle, Gorka Labaka, Ander Barrena, and Eneko Agirre.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.92.pdf)]; [[code](https://github.com/osainz59/Ask2Transformers)].

7. **Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections.** *Ruiqi Zhong, Kristy Lee, Zheng Zhang, and Dan Klein.* Findings of EMNLP 2021. [[pdf](https://aclanthology.org/2021.findings-emnlp.244.pdf)]; [[code](https://github.com/ruiqi-zhong/Meta-tuning)]. 

   

8. **Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System.** *Congying Xia, Wenpeng Yin, Yihao Feng, and Philip Yu.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.106.pdf)]; [[code](https://github.com/congyingxia/IncrementalFSTC)].

   

9.  **ExpBERT: Representation Engineering with Natural Language Explanations.** *Shikhar Murty, Pang Wei Koh, and Percy Liang.* ACL 2020. [[pdf](https://aclanthology.org/2020.acl-main.190.pdf)]; [[code](https://github.com/MurtyShikhar/ExpBERT)].

   

10. **Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach.** *Wenpeng Yin, Jamaal Hay, and Dan Roth.* EMNLP 2019. [[pdf](https://arxiv.org/pdf/1909.00161.pdf)]; [[website](https://cogcomp.seas.upenn.edu/page/publication_view/883)].

### 4.2 PLM-oriented Instruction

![plm_oriented](./resources/PLM_oriented.png)

PLM-oriented instruction (i.e., a prompt) constructs a cloze-style input to elicit responses from pre-trained language models (PLMs). Here, we display several representative works on PLM-oriented instruction learning. For more works, please refer to [this repository](https://github.com/thunlp/PromptPapers) and [this survey](https://dl.acm.org/doi/pdf/10.1145/3560815).
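
A minimal sketch of a cloze-style prompt for sentiment classification. The template, the verbalizer words, and the toy scorer are all assumptions made for illustration; a real setup would score the candidate words with a masked language model.

```python
# Hypothetical verbalizer: maps words the model may fill in to task labels.
VERBALIZER = {"great": "positive", "terrible": "negative"}


def build_cloze_prompt(text, template="{text} It was [MASK]."):
    """Wrap the task input in a cloze template with one masked slot."""
    return template.format(text=text)


def predict_label(text, fill_mask_scores):
    """Score each verbalizer word for the masked slot and map the best one to a label."""
    prompt = build_cloze_prompt(text)
    scores = fill_mask_scores(prompt, list(VERBALIZER))
    best_word = max(scores, key=scores.get)
    return VERBALIZER[best_word]


def toy_fill_mask_scores(prompt, candidates):
    """Toy stand-in for a masked LM: favors 'great' when the prompt contains 'loved'."""
    positive_cue = "loved" in prompt.lower()
    return {word: float((word == "great") == positive_cue) for word in candidates}


print(build_cloze_prompt("I loved this movie."))
print(predict_label("I loved this movie.", toy_fill_mask_scores))
```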

1. **How Does In-Context Learning Help Prompt Tuning?** *Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, and Mohit Iyyer.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.11521.pdf)]. 

   

2. **Demystifying Prompts in Language Models via Perplexity Estimation.** *Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, and Luke Zettlemoyer.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.04037.pdf)]. 

   

3. **RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning.** *Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2205.12548.pdf)]; [[code](https://github.com/mingkaid/rl-prompt)]. 

   

4. **PPT: Pre-trained Prompt Tuning for Few-shot Learning.** *Yuxian Gu, Xu Han, Zhiyuan Liu, and Minlie Huang.* ACL 2022. [[pdf](https://arxiv.org/pdf/2109.04332.pdf)]; [[code](https://github.com/thu-coai/PPT)]. 

   

5. **P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks.** *Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, and Jie Tang.* ACL 2022. [[pdf](https://arxiv.org/pdf/2110.07602.pdf)]; [[code](https://github.com/THUDM/P-tuning-v2)].

   

6. **KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction.** *Xiang Chen, Ningyu Zhang, Xin Xie, and et al.* WWW 2022. [[pdf](http://128.84.21.203/pdf/2104.07650)]; [[code](https://github.com/zjunlp/KnowPrompt)].

   

7. **GPT Understands, Too.** *Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, and Jie Tang.* Preprint 2021. [[pdf](https://arxiv.org/pdf/2103.10385.pdf)]; [[code](https://github.com/THUDM/P-tuning)].

   

8.  **Few-Shot Text Generation with Natural Language Instructions.** *Timo Schick and Hinrich Schütze.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.32.pdf)]; [[code](https://github.com/timoschick/pet)]. 

   

9.  **It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners.** *Timo Schick and Hinrich Schütze.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.185.pdf)]; [[code](https://github.com/timoschick/pet)]. 

   

10. **Learning How to Ask: Querying LMs with Mixtures of Soft Prompts.** *Guanghui Qin and Jason Eisner.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.410.pdf)]; [[code](https://github.com/hiaoxui/soft-prompts)]. 

   

11. **Prefix-Tuning: Optimizing Continuous Prompts for Generation.** *Xiang Lisa Li and Percy Liang.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-long.353.pdf)]; [[code](https://github.com/XiangLi1999/PrefixTuning)]. 

   

12. **Making Pre-trained Language Models Better Few-shot Learners.** *Tianyu Gao, Adam Fisch, and Danqi Chen.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-long.295.pdf)]; [[code](https://github.com/princeton-nlp/LM-BFF)]. 

   

13. **Template-Based Named Entity Recognition Using BART.** *Leyang Cui, Yu Wu, Jian Liu, Sen Yang, and Yue Zhang.* Findings of ACL 2021. [[pdf](https://aclanthology.org/2021.findings-acl.161.pdf)]; [[code](https://github.com/Nealcly/templateNER)]. 

   

14. **Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference.** *Timo Schick and Hinrich Schütze.* EACL 2021. [[pdf](https://aclanthology.org/2021.eacl-main.20.pdf)]; [[code](https://github.com/timoschick/pet)].

   

15. **Language Models are Unsupervised Multitask Learners.** *Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.* Preprint 2019. [[pdf](https://life-extension.github.io/2020/05/27/GPT%E6%8A%80%E6%9C%AF%E5%88%9D%E6%8E%A2/language-models.pdf)]. 

### 4.3 Human-oriented Instruction

![Human-oriented Instruction](./resources/human_oriented.png)

Human-oriented instructions were originally designed for humans to understand a task and annotate data, such as [Amazon MTurk](https://www.mturk.com/) instructions, which provide sufficient information about the task (e.g., a detailed definition).
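
A minimal sketch of how such an instruction might be assembled into a model input: a task definition, optional demonstrations, and the new input. The field names (`Definition`, `Example`, `Input`, `Output`) are an assumed layout for illustration, not a format prescribed by any paper listed here.

```python
def build_instruction_prompt(definition, demonstrations, task_input):
    """Assemble a task definition, optional demonstrations, and a new input into one prompt."""
    lines = [f"Definition: {definition}", ""]
    for index, (example_input, example_output) in enumerate(demonstrations, start=1):
        lines += [f"Example {index}", f"Input: {example_input}", f"Output: {example_output}", ""]
    lines += [f"Input: {task_input}", "Output:"]
    return "\n".join(lines)


# Placeholder task and demonstrations, written only for this sketch.
prompt = build_instruction_prompt(
    definition="Classify the sentiment of the review as positive or negative.",
    demonstrations=[("I loved this movie.", "positive"), ("A waste of time.", "negative")],
    task_input="The plot was gripping from start to finish.",
)
print(prompt)
```

Passing an empty `demonstrations` list yields a zero-shot prompt that contains only the definition and the new input.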

   

1. **Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors.** *Kai Zhang, Bernal Jiménez Gutiérrez, and Yu Su.* Findings of ACL 2023. [[pdf](https://arxiv.org/pdf/2305.11159.pdf)]; [[code](https://github.com/OSU-NLP-Group/QA4RE)].

   

2. **Symbol tuning improves in-context learning in language models.** *Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08298.pdf)].

   

3. **Small Models are Valuable Plug-ins for Large Language Models.** *Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, and Julian McAuley.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08848.pdf)]; [[code](https://github.com/JetRunner/SuperICL)].

   

4. **How Many Data Samples is an Additional Instruction Worth?** *Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, and Chitta Baral.* Findings of EACL 2023. [[pdf](https://arxiv.org/pdf/2203.09161.pdf)]; [[code](https://github.com/Ravsehajsinghpuri/Multi-Variant-Instructions)].

   

5. **In-Context Instruction Learning.** *Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, and Minjoon Seo.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.14691.pdf)]; [[code](https://github.com/seonghyeonye/ICIL)]. 

   

6. **InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis.** *Kevin Scaria, Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, and Chitta Baral.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.08624.pdf)]; [[code](https://github.com/kevinscaria/InstructABSA)].

   

7. **HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation.** *Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, and Matthew Peters.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10315.pdf)].

8. **Boosting Natural Language Generation from Instructions with Meta-Learning.** *Budhaditya Deb, Guoqing Zheng, and Ahmed Hassan Awadallah.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.11617.pdf)]. 

   

9.  **GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models.** *Archiki Prasad, Peter Hase, Xiang Zhou, and Mohit Bansal.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2203.07281.pdf)]; [[code](https://github.com/archiki/GrIPS)].

   

10. **ConTinTin: Continual Learning from Task Instructions.** *Wenpeng Yin, Jia Li, and Caiming Xiong.* ACL 2022. [[pdf](https://aclanthology.org/2022.acl-long.218.pdf)]. 

   

11. **InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning.** *Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, and Jeffrey P. Bigham.* EMNLP 2022. [[pdf](http://128.84.21.203/pdf/2205.12673)]; [[code](https://github.com/prakharguptaz/Instructdial)]. 

   

12. **Learning to Generate Task-Specific Adapters from Task Description.** *Qinyuan Ye and Xiang Ren.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-short.82.pdf)]; [[code](https://github.com/INK-USC/hypter)]. 

   

13. **The Turking Test: Can Language Models Understand Instructions?** *Avia Efrat and Omer Levy.* Preprint 2020. [[pdf](https://arxiv.org/pdf/2010.11982.pdf)]. 

## 5. 📊 Analyses

### 5.1 Scale

Both model scale and task scale have been found to matter for instruction-based fine-tuning. In general, larger models generalize better, and so do models trained on more tasks. However, some works have raised objections (e.g., [Jang et al.](https://arxiv.org/pdf/2302.03202.pdf) and [Wang et al.](https://arxiv.org/pdf/2210.00185.pdf)).

   

1. **Exploring the Benefits of Training Expert Language Models over Instruction Tuning.** *Joel Jang, Seungone Kim, Seonghyeon Ye, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.03202.pdf)]; [[code](https://github.com/joeljang/ELM)]. 

   

2. **The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.** *Shayne Longpre, Le Hou, Tu Vu, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.13688.pdf)]; [[code](https://github.com/google-research/FLAN/tree/main/flan/v2)]; [[corpus](https://huggingface.co/datasets/SirNeural/flan_v2)].

   

3. **UL2: Unifying Language Learning Paradigms.** *Yi Tay, Mostafa Dehghani, Vinh Q. Tran, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2205.05131.pdf)]; [[checkpoint](https://huggingface.co/google/flan-ul2)].

   

4. **OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization.** *Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.12017.pdf)].   

   

5. **Scaling Instruction-Finetuned Language Models.** *Hyung Won Chung, Le Hou, Shayne Longpre, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.11416.pdf)]; [[checkpoint](https://huggingface.co/docs/transformers/model_doc/flan-t5)]. 

   

6. **Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization.** *Yuxian Gu, Pei Ke, Xiaoyan Zhu, and Minlie Huang.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2210.09175.pdf)]; [[code](https://github.com/thu-coai/UDIT)]. 

   

7. **Emergent Abilities of Large Language Models.** *Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, and et al.* TMLR 2022. [[pdf](https://openreview.net/pdf?id=yzkSU5zdwD)].

   

8.  **Multitask Prompted Training Enables Zero-Shot Task Generalization.** *Victor Sanh, Albert Webson, Colin Raffel, and et al.* ICLR 2022. [[pdf](https://openreview.net/pdf?id=9Vrb9D0WI4)]; [[checkpoint](https://github.com/bigscience-workshop/t-zero)]; [[corpus](https://github.com/bigscience-workshop/promptsource)]. 

   

9.  **Finetuned Language Models are Zero-Shot Learners.** *Jason Wei, Maarten Bosma, Vincent Zhao, and et al.* ICLR 2022. [[pdf](https://openreview.net/pdf?id=gEZrGCozdqR)]; [[code](https://github.com/google-research/flan)].

    

10. **Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks.** *Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, and Heng Ji.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.00185.pdf)]; [[code](https://github.com/MikeWangWZHL/Zemi)].  

    

11. **ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization.** *Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, and Zhilin Yang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2201.06910.pdf)]. 

    

12. **The Power of Scale for Parameter-Efficient Prompt Tuning.** *Brian Lester, Rami Al-Rfou, and Noah Constant.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.243.pdf)]; [[code](https://github.com/google-research/prompt-tuning)]. 

### 5.2 Explanability

We list works that focus on the interpretability and reliability of instruction learning, i.e., explaining *when* and *why* instructions take effect.

   

1. **What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning.** *Jane Pan, Tianyu Gao, Howard Chen, and Danqi Chen.* Findings of ACL 2023. [[pdf](https://arxiv.org/pdf/2305.09731.pdf)]; [[code](https://github.com/princeton-nlp/WhatICLLearns)].

   

2. **REV: Information-Theoretic Evaluation of Free-Text Rationales.** *Hanjie Chen, Faeze Brahman, Xiang Ren, and et al.* ACL 2023. [[pdf](https://arxiv.org/pdf/2210.04982.pdf)]; [[code](https://github.com/HanjieChen/REV)].

   

3. **Interpretability at Scale: Identifying Causal Mechanisms in Alpaca.** *Zhengxuan Wu, Atticus Geiger, Christopher Potts, and Noah D. Goodman.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08809.pdf)]; [[code](https://github.com/frankaging/align-transformers)].

   

4. **Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning.** *Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, and William Yang Wang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.11916.pdf)]; [[code](https://github.com/WANGXinyiLinda/concept-based-demonstration-selection)].

   

5. **The Learnability of In-Context Learning.** *Noam Wies, Yoav Levine, and Amnon Shashua.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.07895.pdf)].

   

6. **Why think step-by-step? Reasoning emerges from the locality of experience.** *Ben Prystawski, and Noah D. Goodman.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.03843.pdf)].

   

7. **Larger language models do in-context learning differently.** *Jerry Wei, Jason Wei, Yi Tay, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.03846.pdf)].

   

8. **What learning algorithm is in-context learning? Investigations with linear models.** *Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, and Denny Zhou.* ICLR 2023. [[pdf](https://openreview.net/pdf?id=0g0X4H8yN4I)]; [[code](https://github.com/ekinakyurek/google-research/tree/master/incontext)].

   

9.  **Can language models learn from explanations in context?** *Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, and et al.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2204.02329.pdf)]. 

   

10. **Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?** *Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, and Luke Zettlemoyer.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2202.12837.pdf)]; [[code](https://github.com/Alrope123/rethinking-demonstrations)]. 

   

11. **Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts.** *Daniel Khashabi, Xinxi Lyu, Sewon Min, and et al.* NAACL 2022. [[pdf](https://aclanthology.org/2022.naacl-main.266.pdf)]; [[code](https://github.com/Alrope123/prompt-waywardness)]. 

   

12. **Do Prompt-Based Models Really Understand the Meaning of Their Prompts?** *Albert Webson and Ellie Pavlick.* NAACL 2022. [[pdf](https://aclanthology.org/2022.naacl-main.167.pdf)]; [[code](https://github.com/awebson/prompt_semantics)].

   

13. **Reframing Instructional Prompts to GPTk’s Language.** *Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, and Hannaneh Hajishirzi.* Findings of ACL 2022. [[pdf](https://aclanthology.org/2022.findings-acl.50.pdf)]; [[code](https://github.com/allenai/reframing/)]. 

   

14. **What Makes Good In-Context Examples for GPT-3?** *Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen.* ACL Workshop 2022. [[pdf](https://aclanthology.org/2022.deelio-1.10.pdf)]; [[code](https://github.com/jiachangliu/KATEGPT3)]. 

   

15. **Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.** *Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp.* ACL 2022. [[pdf](https://aclanthology.org/2022.acl-long.556.pdf)].

   

16. **Calibrate Before Use: Improving Few-shot Performance of Language Models.** *Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh.* ICML 2021. [[pdf](https://arxiv.org/pdf/2102.09690.pdf)]; [[code](https://github.com/tonyzhaozh/few-shot-learning)].

### 5.3 Robustness and Safety

   

1. **Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection.** *Jun Yan, Vikas Yadav, Shiyang Li, and et al.* Workshop @ NeurIPS 2023. [[pdf](https://arxiv.org/abs/2307.16888)].

   

2. **Evaluating the Zero-shot Robustness of Instruction-tuned Language Models.** *Jiuding Sun, Chantal Shaib, and Byron C. Wallace.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.11270.pdf)].

   

3. **Poisoning Language Models During Instruction Tuning.** *Alexander Wan, Eric Wallace, Sheng Shen, and Dan Klein.* ICML 2023. [[pdf](https://arxiv.org/pdf/2305.00944.pdf)]; [[code](https://github.com/AlexWan0/Poisoning-Instruction-Tuned-Models)].

   

4. **Multi-step Jailbreaking Privacy Attacks on ChatGPT.** *Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng, and Yangqiu Song.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.05197.pdf)].

   

5. **More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models.** *Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.12173.pdf)]; [[code](https://github.com/greshake/llm-security)]. 

   

6. **Robustness of Learning from Task Instructions.** *Jiasheng Gu, Hanzi Xu, Liangyu Nie, and Wenpeng Yin.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.03813.pdf)]. 

7. **Learning from Task Descriptions.** *Orion Weller, Nicholas Lourie, Matt Gardner, and Matthew E. Peters.* EMNLP 2020. [[pdf](https://aclanthology.org/2020.emnlp-main.105.pdf)]; [[code](https://github.com/allenai/zest)]; [[corpus](https://allenai.org/data/zest)]. 

### 5.4 Evaluation

Stop using old-school automatic metrics to evaluate your instruction-tuned system; try more advanced methods to do it comprehensively!

1. **Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2.** *Hamish Ivison, Yizhong Wang, Valentina Pyatkin, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2311.10702.pdf)]; [[model&data](https://huggingface.co/collections/allenai/tulu-v2-suite-6551b56e743e6349aab45101)].

   

2. **How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources.** *Yizhong Wang, Hamish Ivison, Pradeep Dasigi, and et al.* NeurIPS Datasets and Benchmarks 2023. [[pdf](https://arxiv.org/pdf/2306.04751.pdf)]; [[code](https://github.com/allenai/open-instruct)].

3. **Instruction-following Evaluation through Verbalizer Manipulation.** *Shiyang Li, Jun Yan, Hai Wang, Zheng Tang, Xiang Ren, Vijay Srinivasan, and Hongxia Jin.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2307.10558.pdf)].

   

4. **INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models.** *Yew Ken Chia, Pengfei Hong, Lidong Bing, and Soujanya Poria.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.04757.pdf)]; [[code](https://github.com/declare-lab/instruct-eval)]; [[leaderboard](https://declare-lab.net/instruct-eval/)].

### 5.5 Negation

Negation expressions, such as `do not` and `avoid doing`, are difficult for models to correctly understand and follow.

1. **Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts.** *Joel Jang, Seonghyeon Ye, and Minjoon Seo.* ICML Workshop 2023. [[pdf](https://proceedings.mlr.press/v203/jang23a/jang23a.pdf)].

   

2. **Understanding by Understanding Not: Modeling Negation in Language Models.** *Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, and et al.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.102.pdf)]; [[code](https://github.com/arianhosseini/negation-learning)]. 

### 5.6 Complexity

These papers focus on increasing the complexity of instructions to improve model competence: the more complex data included in the instruction mix, the more competent the resulting model tends to be.

1. **WizardLM: Empowering Large Language Models to Follow Complex Instructions.** *Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng, Chongyang Tao, and Daxin Jiang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.12244.pdf)]; [[code](https://github.com/nlpxucan/WizardLM)].

2. **Orca: Progressive Learning from Complex Explanation Traces of GPT-4.** *Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, and Ahmed Awadallah.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.02707.pdf)].

3. **A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment.** *Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, and Nevin L. Zhang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2308.05696.pdf)]; [[code](https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/tree-instruct)].

### 5.7 Other Papers

   

1. **Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions.** *Mihir Parmar, Swaroop Mishra, Mor Geva, and Chitta Baral.* EACL 2023. [[pdf](https://arxiv.org/pdf/2205.00415.pdf)]; [[code](https://github.com/Mihir3009/instruction-bias)].

2. **Instruction Tuned Models are Quick Learners.** *Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.05539.pdf)]; [[code](https://github.com/srsawant34/efficient_instruction_learning)].

3. **Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning.** *Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin Raffel.* NeurIPS 2022. [[pdf](https://openreview.net/pdf?id=rBCvMG-JsPd)]; [[code](https://github.com/r-three/t-few)]. 

4. **A Survey of NLP-Related Crowdsourcing HITs: what works and what does not.** *Jessica Huynh, Jeffrey Bigham, and Maxine Eskenazi.* Preprint 2021. [[pdf](https://arxiv.org/pdf/2111.05241.pdf)].

   

   

   

## 6. 🤖 Applications

### 6.1 Human-Computer Interaction

Instructions are used in various human-computer interaction (HCI) tasks, such as virtual assistants, chatbots, etc. 

1. **Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing.** *Tuhin Chakrabarty, Vishakh Padmakumar, and He He.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2210.13669.pdf)]; [[code](https://github.com/vishakhpk/creative-instructions)]. 

   

2. **HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models.** *Swaroop Mishra, and Elnaz Nouri.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2208.08232.pdf)]. 

   

3. **EditEval: An Instruction-Based Benchmark for Text Improvements.** *Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2209.13331.pdf)]; [[code](https://github.com/facebookresearch/EditEval)]; [[website](https://eval.ai/web/challenges/challenge-page/1866/overview)].

   

4. **Communicating Natural Programs to Humans and Machines.** *Sam Acquaviva, Yewen Pu, Marta Kryven, and et al.* NeurIPS Workshop 2022. [[pdf](https://openreview.net/pdf?id=OxFoLTKDcNm)]; [[code](https://github.com/samacqua/LARC)]. 

   

5. **Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations.** *Toby Jia-Jun Li, Tom Mitchell, and Brad Myers.* ACL Demo 2020. [[pdf](https://aclanthology.org/2020.acl-demos.25.pdf)]; [[code](https://github.com/tobyli/Sugilite_development)]; [[video](https://www.youtube.com/watch?v=tdHEk-GeaqE)].

   

6. **Multi-Modal Interactive Task Learning from Demonstrations and Natural Language Instructions.** *Toby Jia-Jun Li.* UIST 2020. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3379350.3415803)]; [[code](https://github.com/tobyli/Sugilite_development)].

   

7. **Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following.** *David Gaddy, and Dan Klein.* ACL 2019. [[pdf](https://aclanthology.org/P19-1188.pdf)]. 

   

8. **VirtualHome: Simulating Household Activities via Programs.** *Xavier Puig, Kevin Ra, Marko Boben, and et al.* CVPR 2018. [[pdf](https://openaccess.thecvf.com/content_cvpr_2018/papers/Puig_VirtualHome_Simulating_Household_CVPR_2018_paper.pdf)]; [[website](http://virtual-home.org/)]. 

   

9.  **Natural Language Communication with Robots.** *Yonatan Bisk, Deniz Yuret, and Daniel Marcu.* NAACL 2016. [[pdf](https://aclanthology.org/N16-1089.pdf)]; [[website](https://groundedlanguage.github.io/)].

    

10. **Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World.** *Jayant Krishnamurthy, and Thomas Kollar.* TACL 2013. [[pdf](http://rtw.ml.cmu.edu/tacl2013_lsp/tacl2013-krishnamurthy-kollar.pdf)]; [[code](http://rtw.ml.cmu.edu/tacl2013_lsp/)]. 

11. **Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions.** *Yoav Artzi, and Luke Zettlemoyer.* TACL 2013. [[pdf](https://aclanthology.org/Q13-1005.pdf)].

    

12. **Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision.** *Joohyun Kim, and Raymond Mooney.* EMNLP 2012. [[pdf](https://aclanthology.org/D12-1040.pdf)].

    

13. **A joint model of language and perception for grounded attribute learning.** *Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, and Dieter Fox.* ICML 2012. [[pdf](https://arxiv.org/pdf/1206.6423.pdf)]. 

    

14. **Learning to Interpret Natural Language Instructions.** *Monica Babeş-Vroman, James MacGlashan, Ruoyuan Gao, and et al.* ACL Workshop 2012. [[pdf](https://aclanthology.org/W12-2801.pdf)]. 

    

15. **Fast Online Lexicon Learning for Grounded Language Acquisition.** *David Chen.* ACL 2012. [[pdf](https://aclanthology.org/P12-1045.pdf)].

    

16. **Learning to Win by Reading Manuals in a Monte-Carlo Framework.** *S.R.K. Branavan, David Silver, and Regina Barzilay.* ACL 2011. [[pdf](https://aclanthology.org/P11-1028.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/civ/)].

    

17. **Learning from natural instructions.** *Dan Goldwasser, and Dan Roth.* IJCAI 2011. [[pdf](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=2aba84801935041774c1e2b749e0331efa322ed8)].

    

18. **Learning to Interpret Natural Language Navigation Instructions from Observations.** *David L. Chen and Raymond J. Mooney.* AAAI 2011. [[pdf](https://www.cs.utexas.edu/users/ml/papers/chen.aaai11.pdf)]. 

    

19. **Approaching the Symbol Grounding Problem with Probabilistic Graphical Models.** *Stefanie Tellex, Thomas Kollar, Steven Dickerson, and et al.* AAAI 2011. [[pdf](https://cs.brown.edu/people/stellex/publications/tellex11a.pdf)]. 

    

20. **Driving Semantic Parsing from the World’s Response.** *James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth.* CoNLL 2010. [[pdf](https://aclanthology.org/W10-2903.pdf)]. 

    

21. **Learning to Follow Navigational Directions.** *Adam Vogel, and Daniel Jurafsky.* ACL 2010. [[pdf](https://aclanthology.org/P10-1083.pdf)].

    

22. **Reading between the Lines: Learning to Map High-Level Instructions to Commands.** *S.R.K. Branavan, Luke Zettlemoyer, and Regina Barzilay.* ACL 2010. [[pdf](https://aclanthology.org/P10-1129.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/rl-hli/)]. 

    

23. **Reading to Learn: Constructing Features from Semantic Abstracts.** *Jacob Eisenstein, James Clarke, Dan Goldwasser, and Dan Roth.* EMNLP 2009. [[pdf](https://aclanthology.org/D09-1100.pdf)]; [[website](http://www.comlab.ox.ac.uk/activities/machinelearning/Aleph/)]. 

    

24. **Learning Semantic Correspondences with Less Supervision.** *Percy Liang, Michael Jordan, and Dan Klein.* ACL 2009. [[pdf](https://aclanthology.org/P09-1011.pdf)]. 

    

25. **Reinforcement Learning for Mapping Instructions to Actions.** *S.R.K. Branavan, Harr Chen, Luke Zettlemoyer, and Regina Barzilay.* ACL 2009. [[pdf](https://aclanthology.org/P09-1010.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/rl/)]. 

    

26. **Learning to sportscast: a test of grounded language acquisition.** *David L. Chen and Raymond J. Mooney.* ICML 2008. [[pdf](https://dl.acm.org/doi/pdf/10.1145/1390156.1390173)]. 

    

27. **Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer.** *Gregory Kuhlmann, Peter Stone, Raymond Mooney, and Jude Shavlik.* AAAI Workshop 2004. [[pdf](https://ftp.cs.wisc.edu/machine-learning/shavlik-group/kuhlmann-aaai04.pdf)]; [[website](http://www.cs.utexas.edu/AustinVilla/sim/keepaway/)]. 

### 6.2 Data and Feature Augmentation

Some instructions (e.g., label explanations) are also used for automatic annotation (i.e., data augmentation) or for enriching features.

1. **One Embedder, Any Task: Instruction-Finetuned Text Embeddings.** *Hongjin Su, Weijia Shi, Jungo Kasai, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.09741.pdf)]; [[website](https://instructor-embedding.github.io/)]. 

   

2. **Prompt Consistency for Zero-Shot Task Generalization.** *Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, and Graham Neubig.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2205.00049.pdf)]; [[code](https://github.com/violet-zct/swarm-distillation-zero-shot)]. 

   

3. **Teaching Machine Comprehension with Compositional Explanations.** *Qinyuan Ye, Xiao Huang, Elizabeth Boschee, and Xiang Ren.* Findings of EMNLP 2020. [[pdf](https://aclanthology.org/2020.findings-emnlp.145.pdf)]; [[code](https://github.com/INK-USC/mrc-explanation)]. 

   

4. **Learning from Explanations with Neural Execution Tree.** *Ziqi Wang, Yujia Qin, Wenxuan Zhou, Jun Yan, Qinyuan Ye, Leonardo Neves, Zhiyuan Liu, and Xiang Ren.* ICLR 2020. [[pdf](https://openreview.net/pdf?id=rJlUt0EYwS)]; [[website](http://inklab.usc.edu/project-NExT/)]. 

   

5. **Training Classifiers with Natural Language Explanations.** *Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré.* ACL 2018. [[pdf](https://aclanthology.org/P18-1175.pdf)]; [[code](https://github.com/HazyResearch/babble)]. 

   

6. **Zero-shot Learning of Classifiers from Natural Language Quantification.** *Shashank Srivastava, Igor Labutov, and Tom Mitchell.* ACL 2018. [[pdf](https://aclanthology.org/P18-1029.pdf)]. 

   

7. **Joint Concept Learning and Semantic Parsing from Natural Language Explanations.** *Shashank Srivastava, Igor Labutov, and Tom Mitchell.* EMNLP 2017. [[pdf](https://aclanthology.org/D17-1161.pdf)]. 

### 6.3 General-purpose Language Models

General-purpose language models that can align well with human values, e.g., [ChatGPT](https://chat.openai.com/chat), are among the most attractive applications of instruction learning.

1. **Sparks of Artificial General Intelligence: Early experiments with GPT-4.** *Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.12712.pdf)]. 

   

2. **GPT-4 Technical Report.** *OpenAI.* Preprint 2023. [[pdf](https://cdn.openai.com/papers/gpt-4.pdf)]; [[blog](https://openai.com/research/gpt-4)].  

   

3. **The Wisdom of Hindsight Makes Language Models Better Instruction Followers.** *Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, and Joseph E. Gonzalez.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.05206.pdf)]; [[code](https://github.com/tianjunz/HIR)].

    

4. **Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models.** *Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, and Bryan Catanzaro.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.07388.pdf)]. 

   

5. **Training language models to follow instructions with human feedback.** *Long Ouyang, Jeffrey Wu, Xu Jiang, and et al.* NeurIPS 2022. [[pdf](https://openreview.net/pdf?id=TG8KACxEON)]. 

### 6.4 Other Papers

1. **GPTScore: Evaluate as You Desire.** *Jinlan Fu, See-Kiong Ng, Zhengbao Jiang, and Pengfei Liu.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.04166.pdf)]; [[code](https://github.com/jinlanfu/GPTScore)]. 

   

2. **MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning.** *Zhiyang Xu, Ying Shen, and Lifu Huang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10773.pdf)].

   

3. **Task-aware Retrieval with Instructions.** *Akari Asai, Timo Schick, Patrick Lewis, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.09260.pdf)]; [[code](https://github.com/facebookresearch/tart)]. 

   

4. **UnifiedABSA: A Unified ABSA Framework Based on Multi-task Instruction Tuning.** *Zengzhi Wang, Rui Xia, and Jianfei Yu.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.10986.pdf)].  

 

5. **In-Context Learning for Few-Shot Dialogue State Tracking.** *Yushi Hu, Chia-Hsuan Lee, Tianbao Xie, Tao Yu, Noah A. Smith, and Mari Ostendorf.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2203.08568.pdf)]; [[code](https://github.com/Yushi-Hu/IC-DST)]. 

   

6. **Few-shot Learning with Multilingual Language Models.** *Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2112.10668.pdf)]; [[code](https://github.com/facebookresearch/fairseq/tree/main/examples/xglm)].

   

7. **UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models.** *Tianbao Xie, Chen Henry Wu, Peng Shi, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2201.05966.pdf)]; [[code](https://github.com/HKUNLP/UnifiedSKG)]; [[website](https://unifiedskg.com/)]. 

   

8. **In-BoXBART: Get Instructions into Biomedical Multi-Task Learning.** *Mihir Parmar, Swaroop Mishra, Mirali Purohit, Man Luo, M. Hassan Murad, and Chitta Baral.* Findings of NAACL 2022. [[pdf](https://arxiv.org/pdf/2204.07600.pdf)]; [[code](https://github.com/Mihir3009/In-BoXBART)].

## 7. 📖 Extended Reading

We also share some other awesome papers that might inspire future work.

### 7.1 Instruction Induction

   

1. **Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners.** *Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, and Minjoon Seo.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.02969.pdf)]; [[code](https://github.com/seonghyeonye/Flipped-Learning)]. 

   

2. **Instruction Induction: From Few Examples to Natural Language Task Descriptions.** *Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2205.10782.pdf)]; [[code](https://github.com/orhonovich/instruction-induction)].

   

3. **Learning to Decompose and Organize Complex Tasks.** *Yi Zhang, Sujay Kumar Jauhar, Julia Kiseleva, Ryen White, and Dan Roth.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.217.pdf)]; [[corpus](https://github.com/microsoft/MSComplexTasks)]. 

   

4. **Analogous Process Structure Induction for Sub-event Sequence Prediction.** *Hongming Zhang, Muhao Chen, Haoyu Wang, Yangqiu Song, and Dan Roth.* EMNLP 2020. [[pdf](https://aclanthology.org/2020.emnlp-main.119.pdf)]; [[code](https://cogcomp.github.io/APSI/)]. 

### 7.2 ChatGPT-related Papers

Nowadays, ChatGPT is a superstar 🌟 in the NLP community. Since there is no official paper for ChatGPT, we share some frontier works that can provide deep insights into ChatGPT.

   

1. **When do you need Chain-of-Thought Prompting for ChatGPT?** *Jiuhai Chen, Lichang Chen, Heng Huang, and Tianyi Zhou.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.03262.pdf)].

   

2. **Toxicity in ChatGPT: Analyzing Persona-assigned Language Models.** *Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, and Karthik Narasimhan.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.05335.pdf)].

   

3. **Is ChatGPT a General-Purpose Natural Language Processing Task Solver?** *Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, and Diyi Yang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.06476.pdf)].

   

4. **How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection.** *Biyang Guo, Xin Zhang, Ziyuan Wang, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.07597.pdf)]; [[corpus](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)]. 

   

5. **ChatGPT: Jack of all trades, master of none.** *Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.10724.pdf)].

   

6. **On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective.** *Jindong Wang, Xixu Hu, Wenxin Hou, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.12095.pdf)]; [[code](https://github.com/microsoft/robustlearn)]. 

### 7.3 Human Feedback vs. Model Feedback

   

1. **Aligning Large Language Models through Synthetic Feedback.** *Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, and Minjoon Seo.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.13735.pdf)].

   

2. **LIMA: Less Is More for Alignment.** *Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.11206.pdf)].

   

3. **Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision.** *Zhiqing Sun, Yikang Shen, Qinhong Zhou, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.03047.pdf)]; [[code](https://github.com/IBM/Dromedary)].

   

4. **Chain of Hindsight Aligns Language Models with Feedback.** *Hao Liu, Carmelo Sferrazza, and Pieter Abbeel.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.02676.pdf)]; [[code](https://github.com/lhao499/CoH)]. 

   

5. **Pretraining Language Models with Human Preferences.** *Tomasz Korbak, Kejian Shi, Angelica Chen, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.08582.pdf)].

   

6. **Constitutional AI: Harmlessness from AI Feedback.** *Yuntao Bai, Saurav Kadavath, Sandipan Kundu, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.08073.pdf)]; [[corpus](https://github.com/anthropics/ConstitutionalHarmlessnessPaper)].

   

7. **Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback.** *Yuntao Bai, Andy Jones, Kamal Ndousse, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2204.05862.pdf)]; [[corpus](https://github.com/anthropics/hh-rlhf)]. 

### 7.4 Scalable Oversight and Alignment

1. **Measuring Progress on Scalable Oversight for Large Language Models.** *Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.03540.pdf)].

2. **Aligning AI With Shared Human Values.** *Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, and Jacob Steinhardt.* ICLR 2021. [[pdf](https://openreview.net/pdf?id=dNy_RKzJacY)].

### 7.5 Other Papers

1. **Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models.** *Kaitlyn Zhou, Dan Jurafsky, and Tatsunori Hashimoto.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.13439.pdf)].

   

2. **The Capacity for Moral Self-Correction in Large Language Models.** *Deep Ganguli, Amanda Askell, Nicholas Schiefer, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.07459.pdf)]. 

   

3. **Large Language Models Can Be Easily Distracted by Irrelevant Context.** *Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed Chi, Nathanael Schärli, and Denny Zhou.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.00093.pdf)]; [[corpus](https://github.com/google-research-datasets/GSM-IC)].

4. **Language Models (Mostly) Know What They Know.** *Saurav Kadavath, Tom Conerly, Amanda Askell, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2207.05221.pdf)].

---

## ⭐ Star History

[![Star History Chart](https://api.star-history.com/svg?repos=RenzeLou/awesome-instruction-learning&type=Date)](https://star-history.com/#RenzeLou/awesome-instruction-learning&Date)