Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/RenzeLou/awesome-instruction-learning
Papers and Datasets on Instruction Tuning and Following. ✨✨✨
List: awesome-instruction-learning
awesome-list datasets in-context-learning instruction instruction-learning instruction-tuning large-language-models paper-list pretrained-language-model prompt survey
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/RenzeLou/awesome-instruction-learning
- Owner: RenzeLou
- License: mit
- Created: 2023-02-21T01:43:05.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-04T19:48:35.000Z (10 months ago)
- Last Synced: 2024-05-22T00:06:25.720Z (8 months ago)
- Topics: awesome-list, datasets, in-context-learning, instruction, instruction-learning, instruction-tuning, large-language-models, paper-list, pretrained-language-model, prompt, survey
- Language: Python
- Homepage: https://arxiv.org/abs/2303.10475
- Size: 6.25 MB
- Stars: 408
- Watchers: 7
- Forks: 21
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-instruction-tuning - awesome-instruction-learning
- ultimate-awesome - awesome-instruction-learning - Papers and Datasets on Instruction Tuning and Following. ✨✨✨. (Other Lists / Monkey C Lists)
README
Awesome Instruction Learning
🔥🔥🔥 An awesome reading list of Instruction Tuning and Following, including papers and datasets.
👉 Explore our latest survey update! Feel free to dive in and discover the improvements we've made 👀 🤗 : Latest Survey

---
## ❤️ Contribution
This repository is currently maintained by [Renze Lou](https://renzelou.github.io/) @ PennState and [Kai Zhang](https://drogozhang.github.io/) @ OhioState. **We appreciate any contributions** ❤️.
If you have any suggestions or find any missed papers, feel free to [reach out](https://outlook.office.com/mail/deeplink/compose?mailtouri=mailto%3Amarionojump0722%40gmail.com) or submit a [pull request](https://github.com/RenzeLou/awesome-instruction-learning/pulls):
1. Use the following markdown format.
```markdown
**Paper Title.** *Author 1, Author 2, and Author 3.* Conference/Journal/Preprint Year. [[pdf](link)]; [[other resources](link)].
```
2. If one preprint paper has multiple versions, please use **the earliest submitted year**.
3. Display the papers in **descending order by year** (latest first).
## 🥳 Citation
Find this repository helpful? 😊😊😊
Please consider citing our paper. 👇👇👇
```
@article{lou2023instruction,
title={A Comprehensive Survey on Instruction Following},
author={Lou, Renze and Zhang, Kai and Yin, Wenpeng},
journal={arXiv preprint arXiv:2303.10475},
year={2023}
}
```

---
## 🔍 Table of Contents
- [1. 💁🏽♀️ Introduction](#1-️-introduction)
- [2. 🎓 Surveys and Tutorials](#2--surveys-and-tutorials)
- [3. 📚 Corpora](#3--corpora)
- [4. 🗂️ Taxonomy](#4-️-taxonomy)
  - [4.1 Entailment-oriented Instruction](#41-entailment-oriented-instruction)
  - [4.2 PLM-oriented Instruction](#42-plm-oriented-instruction)
  - [4.3 Human-oriented Instruction](#43-human-oriented-instruction)
- [5. 📊 Analyses](#5--analyses)
  - [5.1 Scale](#51-scale)
  - [5.2 Explanability](#52-explanability)
  - [5.3 Robustness and Safety](#53-robustness-and-safety)
  - [5.4 Evaluation](#54-evaluation)
  - [5.5 Negation](#55-negation)
  - [5.6 Complexity](#56-complexity)
  - [5.7 Other Papers](#57-other-papers)
- [6. 🤖 Applications](#6--applications)
  - [6.1 Human-Computer Interaction](#61-human-computer-interaction)
  - [6.2 Data and Feature Augmentation](#62-data-and-feature-augmentation)
  - [6.3 General-purpose Language Models](#63-general-purpose-language-models)
  - [6.4 Other Papers](#64-other-papers)
- [7. 📖 Extended Reading](#7--extended-reading)
  - [7.1 Instruction Induction](#71-instruction-induction)
  - [7.2 ChatGPT-related Papers](#72-chatgpt-related-papers)
  - [7.3 Human Feedback vs. Model Feedback](#73-human-feedback-vs-model-feedback)
  - [7.4 Scalable Oversight and Alignment](#74-scalable-oversight-and-alignment)
  - [7.5 Other Papers](#75-other-papers)

---
## 1. 💁🏽♀️ Introduction
Why *instruction-driven* learning instead of *example-driven* learning?
- 👉 **Affordable.** For the conventional example-driven supervised learning, each *downstream* task usually requires extensive labeled examples 💰. While for instruction learning, each *downstream* task may require only one instruction and just a few examples 🤩.
- 👉 **One model, all tasks.** An ideal AI system should be able to quickly understand and handle various new tasks 💫.
- 👉 **A promising research direction.** Traditional example-driven supervised learning uses labeled instances to represent the task semantics, i.e., training models by observing numerous examples to recover the original task meaning. Therefore, **why not directly use the task instruction**, **which already carries the essential task semantics**?
## 2. 🎓 Surveys and Tutorials
We use the label ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A) to denote papers with a more comprehensive perspective. Other papers focus on a specific type of in-context instruction, including ![prompt](https://img.shields.io/badge/prompt-90EE90), few-shot ![in-context demonstrations](https://img.shields.io/badge/demonstrations-FFB6C1), and CoT ![reasoning](https://img.shields.io/badge/reasoning-9cf).
1. **A Comprehensive Survey on Instruction Following.** *Renze Lou, Kai Zhang, and Wenpeng Yin.* Preprint 2023. [[pdf](https://arxiv.org/abs/2303.10475)]; [[paper list](https://github.com/RenzeLou/awesome-instruction-learning)]. ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A)
2. **Learning from Task Instructions.** *Wenpeng Yin, Qinyuan Ye, Pengfei Liu, Xiang Ren, and Hinrich Schütze.* EMNLP Tutorial 2023. [[pdf](https://aclanthology.org/2023.emnlp-tutorial.4.pdf)]. ![comprehensive](https://img.shields.io/badge/comprehensive-FFA07A)
3. **Natural Language Reasoning, A Survey.** *Fei Yu, Hongbo Zhang, and Benyou Wang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.14725.pdf)]; [[paper list](https://github.com/FreedomIntelligence/ReasoningNLP)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)
4. **Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing.** *Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig.* ACM Computing Surveys 2023. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3560815)]; [[website](http://pretrain.nlpedia.ai/)]. ![prompt](https://img.shields.io/badge/prompt-90EE90)
5. **A Survey on In-context Learning**. *Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, and Zhifang Sui*. Preprint 2022. [[pdf](https://arxiv.org/pdf/2301.00234.pdf)]. ![in-context demonstrations](https://img.shields.io/badge/demonstrations-FFB6C1)
6. **Towards Reasoning in Large Language Models: A Survey.** *Jie Huang, and Kevin Chen-Chuan Chang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10403.pdf)]; [[paper list](https://github.com/jeffhj/LM-reasoning)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)
7. **Reasoning with Language Model Prompting: A Survey.** *Shuofei Qiao, Yixin Ou, Ningyu Zhang, Xiang Chen, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, and Huajun Chen.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.09597.pdf)]; [[paper list](https://github.com/zjunlp/Prompt4ReasoningPapers)]. ![reasoning](https://img.shields.io/badge/reasoning-9cf)
## 3. 📚 Corpora
**High-quality datasets are the key factor for successful instruction tuning**. Therefore, we put the "corpora" section here to emphasize its importance.
We carefully maintain the following table so that it is easy to reference and stays up to date. We hope it contributes to future research on instruction tuning. 🤗
*(Some rows come from [Longpre et al.](https://arxiv.org/pdf/2301.13688.pdf), thanks for their great work ❤️.)*

| Name | Release | Data/Code | Scale: #Tasks | Scale: #Ins. (K) | Language | Annotator |
|---|---|---|---|---|---|---|
| UnifiedQA | 05/2020 | Link | 46 | 750 | | ✍ Human |
| CrossFit | 04/2021 | Link | 159 | 71,000 | | ✍ Human |
| Natural Inst. v1 | 04/2021 | Link | 61 | 620 | | ✍ Human |
| Flan 2021 | 09/2021 | Link | 62 | 4,400 | | ✍ Human |
| P3 | 10/2021 | Link | 62 | 12,000 | | ✍ Human |
| MetaICL | 10/2021 | Link | 142 | 3,500 | | ✍ Human |
| ExMix | 11/2021 | Link | 107 | 500 | | ✍ Human |
| | 04/2022 | Link | 1,613 | 5,000 | | ✍ Human |
| GLM | 10/2022 | Link | 77 | 12,000 | | ✍ Human |
| Flan 2022 | 10/2022 | Link | 1,836 | 15,000 | | ✍ Human |
| xP3 | 11/2022 | Link | 71 | 81,000 | | ✍ Human |
| Unnatural Inst. | 12/2022 | Link | 117 | 64 | | 🤖 InstructGPT002 (text-davinci-002) |
| Self-Instruct | 12/2022 | Link | / | 82 | | 🤖 GPT-3 (davinci) |

OPT-IML
12/2022
/
2,207
18,000
✍ Human🤖 InstructGPT003
text-davinci-003
🤖 ChatGPT
Koala
04/2023
/
/
/
✍ Human
🤖 ChatGPT
✍ Human
🤖 ChatGPT
Alpaca-gpt4
04/2023
Link
/
113🤖 GPT-4
gpt-4
Vicuna
04/2023
/
/
76
✍ Human
🤖 ChatGPT
Dolly
04/2023
Link
/
15
✍ Human
✍ Human
✍ Human
🤖 InstructGPT003
text-davinci-003
Symbolic-Instruct
04/2023
Link
/
796✍ Human
Synthetic Examples
🤖 ChatGPT
🤖 ChatGPT
✍ Human
UltraChat
05/2023
Link
/
1,500
🤖 ChatGPT
CoT Collection
05/2023
Link
1,060
1,880🤖 Codex
Dynosaur
05/2023
Link
5,740
801🤖 ChatGPT
🤖 ChatGPT
🤖 GPT-4
✍ Human
Dynamics-of-Instruction
10/2023
Link
/
40✍ Human
✍ Human
🤖 ChatGPT
WaveCoder
12/2023
Link
4 code-related tasks
20🤖 ChatGPT
🤖 GPT-4
🤖 GPT-4
## 4. 🗂️ Taxonomy
In our paper, we divide the textual instructions into three categories.
### 4.1 Entailment-oriented Instruction
![entailment_oriented](./resources/entailment_oriented.png)
Entailment-oriented instruction regards the task **input** as the **premise**, and constructs the task **output** into the **hypothesis**. It unifies the conventional classification problems into a textual entailment paradigm.
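
The premise/hypothesis construction described above can be sketched in a few lines of Python. This is only an illustration, not the method of any paper listed here: the function names, the label-to-hypothesis templates, and `toy_entail_score` (a word-overlap stand-in for a real textual entailment model) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


def build_entailment_pairs(
    text: str, label_hypotheses: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Pair the task input (premise) with one hypothesis per candidate label."""
    return [(label, text, hyp) for label, hyp in label_hypotheses.items()]


def classify_by_entailment(
    text: str,
    label_hypotheses: Dict[str, str],
    entail_score: Callable[[str, str], float],
) -> str:
    """Return the label whose hypothesis is scored as most entailed by the input."""
    pairs = build_entailment_pairs(text, label_hypotheses)
    best_label, _, _ = max(pairs, key=lambda p: entail_score(p[1], p[2]))
    return best_label


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in scorer: share of hypothesis words found in the premise.

    A real system would call a trained entailment (NLI) model here.
    """
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().rstrip(".").split()
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


# Hypothetical label templates: each class label becomes a hypothesis sentence.
hypotheses = {
    "sports": "this text is about football and sports.",
    "finance": "this text is about stocks and finance.",
}
predicted = classify_by_entailment(
    "the team won the football match", hypotheses, toy_entail_score
)
```

The point of the design is that adding a new class only requires writing a new hypothesis sentence, so the same entailment scorer can be reused across classification tasks.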
1. **A Universal Discriminator for Zero-Shot Generalization.** *Haike Xu, Zongyu Lin, Jing Zhou, Yanan Zheng, and Zhilin Yang.* ACL 2023. [[pdf](https://arxiv.org/pdf/2211.08099.pdf)]; [[code](https://github.com/Rafa-zy/UD)].
2. **ConEntail: An Entailment-based Framework for Universal Zero and Few Shot Classification with Supervised Contrastive Pretraining.** *Ranran Haoran Zhang, Aysa Xuemo Fan, and Rui Zhang.* EACL 2023. [[pdf](https://arxiv.org/pdf/2210.07587.pdf)]; [[code](https://github.com/psunlpgroup/ConEntail)].
3. **OpenStance: Real-world Zero-shot Stance Detection.** *Hanzi Xu, Slobodan Vucetic, and Wenpeng Yin.* CoNLL 2022. [[pdf](https://arxiv.org/pdf/2210.14299.pdf)]; [[code](https://github.com/xhz0809/OpenStance)].
4. **Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference.** *Bangzheng Li, Wenpeng Yin, and Muhao Chen.* TACL 2022. [[pdf](https://aclanthology.org/2022.tacl-1.35.pdf)]; [[code](https://github.com/luka-group/lite)].
5. **Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning.** *Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, and Eneko Agirre.* Findings of NAACL 2022. [[pdf](https://aclanthology.org/2022.findings-naacl.187.pdf)]; [[code](https://github.com/luka-group/lite)].
6. **Label Verbalization and Entailment for Effective Zero and Few-Shot Relation Extraction.** *Oscar Sainz, Oier Lopez de Lacalle, Gorka Labaka, Ander Barrena, and Eneko Agirre.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.92.pdf)]; [[code](https://github.com/osainz59/Ask2Transformers)].
7. **Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections.** *Ruiqi Zhong, Kristy Lee, Zheng Zhang, and Dan Klein.* Findings of EMNLP 2021. [[pdf](https://aclanthology.org/2021.findings-emnlp.244.pdf)]; [[code](https://github.com/ruiqi-zhong/Meta-tuning)].
8. **Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System.** *Congying Xia, Wenpeng Yin, Yihao Feng, and Philip Yu.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.106.pdf)]; [[code](https://github.com/congyingxia/IncrementalFSTC)].
9. **ExpBERT: Representation Engineering with Natural Language Explanations.** *Shikhar Murty, Pang Wei Koh, and Percy Liang.* ACL 2020. [[pdf](https://aclanthology.org/2020.acl-main.190.pdf)]; [[code](https://github.com/MurtyShikhar/ExpBERT)].
10. **Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach.** *Wenpeng Yin, Jamaal Hay, and Dan Roth.* EMNLP 2019. [[pdf](https://arxiv.org/pdf/1909.00161.pdf)]; [[website](https://cogcomp.seas.upenn.edu/page/publication_view/883)].
### 4.2 PLM-oriented Instruction
![plm_oriented](./resources/PLM_oriented.png)
PLM-oriented instruction (i.e., prompt) aims to construct a cloze-style input to steer pre-trained language models (PLMs) for responses. Here, we display several representative works of PLM-oriented instruction learning. For more works, please refer to [this repository](https://github.com/thunlp/PromptPapers) and [this survey](https://dl.acm.org/doi/pdf/10.1145/3560815).
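
As a rough illustration of a cloze-style input, the sketch below wraps the task input in a template with a mask slot and maps each label to a word that could fill it. The template, the `[MASK]` token string, the label words, and `toy_fill_score` (a keyword stand-in for a masked language model's fill-in probability) are assumptions for this example, not taken from a specific paper in the list.

```python
from typing import Callable, Dict

MASK = "[MASK]"  # assumed mask token; the real token depends on the PLM's tokenizer


def build_cloze_prompt(text: str, template: str = "{text} It was {mask}.") -> str:
    """Wrap the task input in a cloze template containing a single mask slot."""
    return template.format(text=text, mask=MASK)


def predict_label(
    text: str,
    verbalizer: Dict[str, str],
    fill_score: Callable[[str, str], float],
) -> str:
    """Pick the label whose label word is the best-scoring filler for the mask."""
    prompt = build_cloze_prompt(text)
    return max(verbalizer, key=lambda label: fill_score(prompt, verbalizer[label]))


def toy_fill_score(prompt: str, word: str) -> float:
    """Hypothetical stand-in for a masked LM: counts hint words seen in the prompt."""
    hints = {"great": ("loved", "enjoyed"), "terrible": ("boring", "hated")}
    return float(sum(h in prompt.lower() for h in hints.get(word, ())))


# Label words (the "verbalizer") map each class to a word that can fill the mask.
verbalizer = {"positive": "great", "negative": "terrible"}
prompt = build_cloze_prompt("I loved this movie.")
label = predict_label("I loved this movie.", verbalizer, toy_fill_score)
```

In practice `fill_score` would be replaced by the probability a pre-trained masked language model assigns to the label word at the mask position.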
1. **How Does In-Context Learning Help Prompt Tuning?** *Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, and Mohit Iyyer.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.11521.pdf)].
2. **Demystifying Prompts in Language Models via Perplexity Estimation.** *Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, and Luke Zettlemoyer.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.04037.pdf)].
3. **RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning.** *Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2205.12548.pdf)]; [[code](https://github.com/mingkaid/rl-prompt)].
4. **PPT: Pre-trained Prompt Tuning for Few-shot Learning.** *Yuxian Gu, Xu Han, Zhiyuan Liu, and Minlie Huang.* ACL 2022. [[pdf](https://arxiv.org/pdf/2109.04332.pdf)]; [[code](https://github.com/thu-coai/PPT)].
5. **P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks.** *Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, and Jie Tang.* ACL 2022. [[pdf](https://arxiv.org/pdf/2110.07602.pdf)]; [[code](https://github.com/THUDM/P-tuning-v2)].
6. **KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction.** *Xiang Chen, Ningyu Zhang, Xin Xie, and et al.* WWW 2022. [[pdf](http://128.84.21.203/pdf/2104.07650)]; [[code](https://github.com/zjunlp/KnowPrompt)].
7. **GPT Understands, Too.** *Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, and Jie Tang.* Preprint 2021. [[pdf](https://arxiv.org/pdf/2103.10385.pdf)]; [[code](https://github.com/THUDM/P-tuning)].
8. **Few-Shot Text Generation with Natural Language Instructions.** *Timo Schick and Hinrich Schütze.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.32.pdf)]; [[code](https://github.com/timoschick/pet)].
9. **It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners.** *Timo Schick and Hinrich Schütze.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.185.pdf)]; [[code](https://github.com/timoschick/pet)].
10. **Learning How to Ask: Querying LMs with Mixtures of Soft Prompts.** *Guanghui Qin and Jason Eisner.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.410.pdf)]; [[code](https://github.com/hiaoxui/soft-prompts)].
11. **Prefix-Tuning: Optimizing Continuous Prompts for Generation.** *Xiang Lisa Li and Percy Liang.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-long.353.pdf)]; [[code](https://github.com/XiangLi1999/PrefixTuning)].
12. **Making Pre-trained Language Models Better Few-shot Learners.** *Tianyu Gao, Adam Fisch, and Danqi Chen.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-long.295.pdf)]; [[code](https://github.com/princeton-nlp/LM-BFF)].
13. **Template-Based Named Entity Recognition Using BART.** *Leyang Cui, Yu Wu, Jian Liu, Sen Yang, and Yue Zhang.* Findings of ACL 2021. [[pdf](https://aclanthology.org/2021.findings-acl.161.pdf)]; [[code](https://github.com/Nealcly/templateNER)].
14. **Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference.** *Timo Schick and Hinrich Schütze.* EACL 2021. [[pdf](https://aclanthology.org/2021.eacl-main.20.pdf)]; [[code](https://github.com/timoschick/pet)].
15. **Language Models are Unsupervised Multitask Learners.** *Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.* Preprint 2019. [[pdf](https://life-extension.github.io/2020/05/27/GPT%E6%8A%80%E6%9C%AF%E5%88%9D%E6%8E%A2/language-models.pdf)].
### 4.3 Human-oriented Instruction
![Human-oriented Instruction](./resources/human_oriented.png)
Human-oriented instruction was originally designed for humans to understand the task and annotate the data, such as [Amazon MTurk](https://www.mturk.com/) instructions, which provide sufficient information about the task (e.g., a detailed definition).
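
To make the idea concrete, here is a minimal sketch of how such an instruction might be assembled into a model input: a task definition, a few demonstrations, and the new input. The `TaskInstruction` class and the `Definition` / `Example` / `Input` / `Output` field layout are assumptions chosen for illustration; individual datasets use their own schemas.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TaskInstruction:
    """A crowdsourcing-style instruction: a task definition plus a few demonstrations."""

    definition: str
    examples: List[Tuple[str, str]] = field(default_factory=list)

    def render(self, task_input: str) -> str:
        """Concatenate definition, demonstrations, and the new input into one prompt."""
        lines = [f"Definition: {self.definition}"]
        for i, (example_input, example_output) in enumerate(self.examples, start=1):
            lines.append(f"Example {i} - Input: {example_input} Output: {example_output}")
        lines.append(f"Input: {task_input}")
        lines.append("Output:")
        return "\n".join(lines)


# Hypothetical sentiment task used only to show the rendered layout.
instruction = TaskInstruction(
    definition="Label the sentiment of the review as positive or negative.",
    examples=[("Great food!", "positive")],
)
rendered = instruction.render("The service was slow.")
```

Unlike a cloze prompt, the rendered text reads like guidance written for a human annotator, which is what distinguishes this category.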
1. **Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors.** *Kai Zhang, Bernal Jiménez Gutiérrez, and Yu Su.* Findings of ACL 2023. [[pdf](https://arxiv.org/pdf/2305.11159.pdf)]; [[code](https://github.com/OSU-NLP-Group/QA4RE)].
2. **Symbol tuning improves in-context learning in language models.** *Jerry Wei, Le Hou, Andrew Lampinen, Xiangning Chen, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08298.pdf)].
3. **Small Models are Valuable Plug-ins for Large Language Models.** *Canwen Xu, Yichong Xu, Shuohang Wang, Yang Liu, Chenguang Zhu, and Julian McAuley.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08848.pdf)]; [[code](https://github.com/JetRunner/SuperICL)].
4. **How Many Data Samples is an Additional Instruction Worth?** *Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, and Chitta Baral.* Findings of EACL 2023. [[pdf](https://arxiv.org/pdf/2203.09161.pdf)]; [[code](https://github.com/Ravsehajsinghpuri/Multi-Variant-Instructions)].
5. **In-Context Instruction Learning.** *Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, and Minjoon Seo.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.14691.pdf)]; [[code](https://github.com/seonghyeonye/ICIL)].
6. **InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis.** *Kevin Scaria, Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, and Chitta Baral.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.08624.pdf)]; [[code](https://github.com/kevinscaria/InstructABSA)].
7. **HINT: Hypernetwork Instruction Tuning for Efficient Zero-Shot Generalisation.** *Hamish Ivison, Akshita Bhagia, Yizhong Wang, Hannaneh Hajishirzi, and Matthew Peters.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10315.pdf)].
8. **Boosting Natural Language Generation from Instructions with Meta-Learning.** *Budhaditya Deb, Guoqing Zheng, and Ahmed Hassan Awadallah.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.11617.pdf)].
9. **GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models.** *Archiki Prasad, Peter Hase, Xiang Zhou, and Mohit Bansal.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2203.07281.pdf)]; [[code](https://github.com/archiki/GrIPS)].
10. **ConTinTin: Continual Learning from Task Instructions.** *Wenpeng Yin, Jia Li, and Caiming Xiong.* ACL 2022. [[pdf](https://aclanthology.org/2022.acl-long.218.pdf)].
11. **InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning.** *Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, and Jeffrey P. Bigham.* EMNLP 2022. [[pdf](http://128.84.21.203/pdf/2205.12673)]; [[code](https://github.com/prakharguptaz/Instructdial)].
12. **Learning to Generate Task-Specific Adapters from Task Description.** *Qinyuan Ye and Xiang Ren.* ACL 2021. [[pdf](https://aclanthology.org/2021.acl-short.82.pdf)]; [[code](https://github.com/INK-USC/hypter)].
13. **The Turking Test: Can Language Models Understand Instructions?** *Avia Efrat and Omer Levy.* Preprint 2020. [[pdf](https://arxiv.org/pdf/2010.11982.pdf)].
## 5. 📊 Analyses
### 5.1 Scale
Both model scale and task scale are found to be important for instruction-based fine-tuning. In general, a larger model generalizes better, and so does training on more tasks. However, some works raise objections (e.g., [Jang et al.](https://arxiv.org/pdf/2302.03202.pdf) and [Wang et al.](https://arxiv.org/pdf/2210.00185.pdf)).
1. **Exploring the Benefits of Training Expert Language Models over Instruction Tuning.** *Joel Jang, Seungone Kim, Seonghyeon Ye, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.03202.pdf)]; [[code](https://github.com/joeljang/ELM)].
2. **The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.** *Shayne Longpre, Le Hou, Tu Vu, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.13688.pdf)]; [[code](https://github.com/google-research/FLAN/tree/main/flan/v2)]; [[corpus](https://huggingface.co/datasets/SirNeural/flan_v2)].
3. **UL2: Unifying Language Learning Paradigms.** *Yi Tay, Mostafa Dehghani, Vinh Q. Tran, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2205.05131.pdf)]; [[checkpoint](https://huggingface.co/google/flan-ul2)].
4. **OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization.** *Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.12017.pdf)].
5. **Scaling Instruction-Finetuned Language Models.** *Hyung Won Chung, Le Hou, Shayne Longpre, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.11416.pdf)]; [[checkpoint](https://huggingface.co/docs/transformers/model_doc/flan-t5)].
6. **Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization.** *Yuxian Gu, Pei Ke, Xiaoyan Zhu, and Minlie Huang.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2210.09175.pdf)]; [[code](https://github.com/thu-coai/UDIT)].
7. **Emergent Abilities of Large Language Models.** *Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, and et al.* TMLR 2022. [[pdf](https://openreview.net/pdf?id=yzkSU5zdwD)].
8. **Multitask Prompted Training Enables Zero-Shot Task Generalization.** *Victor Sanh, Albert Webson, Colin Raffel, and et al.* ICLR 2022. [[pdf](https://openreview.net/pdf?id=9Vrb9D0WI4)]; [[checkpoint](https://github.com/bigscience-workshop/t-zero)]; [[corpus](https://github.com/bigscience-workshop/promptsource)].
9. **Finetuned Language Models are Zero-Shot Learners.** *Jason Wei, Maarten Bosma, Vincent Zhao, and et al.* ICLR 2022. [[pdf](https://openreview.net/pdf?id=gEZrGCozdqR)]; [[code](https://github.com/google-research/flan)].
10. **Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks.** *Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, and Heng Ji.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.00185.pdf)]; [[code](https://github.com/MikeWangWZHL/Zemi)].
11. **ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization.** *Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, and Zhilin Yang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2201.06910.pdf)].
12. **The Power of Scale for Parameter-Efficient Prompt Tuning.** *Brian Lester, Rami Al-Rfou, and Noah Constant.* EMNLP 2021. [[pdf](https://aclanthology.org/2021.emnlp-main.243.pdf)]; [[code](https://github.com/google-research/prompt-tuning)].
### 5.2 Explanability
We list works that focus on the interpretability and reliability of instruction learning, i.e., explaining *when* and *why* instructions take effect.
1. **What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning.** *Jane Pan, Tianyu Gao, Howard Chen, and Danqi Chen.* Findings of ACL 2023. [[pdf](https://arxiv.org/pdf/2305.09731.pdf)]; [[code](https://github.com/princeton-nlp/WhatICLLearns)].
2. **REV: Information-Theoretic Evaluation of Free-Text Rationales.** *Hanjie Chen, Faeze Brahman, Xiang Ren, and et al.* ACL 2023. [[pdf](https://arxiv.org/pdf/2210.04982.pdf)]; [[code](https://github.com/HanjieChen/REV)].
3. **Interpretability at Scale: Identifying Causal Mechanisms in Alpaca.** *Zhengxuan Wu, Atticus Geiger, Christopher Potts, and Noah D. Goodman.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.08809.pdf)]; [[code](https://github.com/frankaging/align-transformers)].
4. **Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning.** *Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, and William Yang Wang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.11916.pdf)]; [[code](https://github.com/WANGXinyiLinda/concept-based-demonstration-selection)].
5. **The Learnability of In-Context Learning.** *Noam Wies, Yoav Levine, and Amnon Shashua.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.07895.pdf)].
6. **Why think step-by-step? Reasoning emerges from the locality of experience.** *Ben Prystawski, and Noah D. Goodman.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.03843.pdf)].
7. **Larger language models do in-context learning differently.** *Jerry Wei, Jason Wei, Yi Tay, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.03846.pdf)].
8. **What learning algorithm is in-context learning? Investigations with linear models.** *Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, and Denny Zhou.* ICLR 2023. [[pdf](https://openreview.net/pdf?id=0g0X4H8yN4I)]; [[code](https://github.com/ekinakyurek/google-research/tree/master/incontext)].
9. **Can language models learn from explanations in context?** *Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, and et al.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2204.02329.pdf)].
10. **Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?** *Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, and Luke Zettlemoyer.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2202.12837.pdf)]; [[code](https://github.com/Alrope123/rethinking-demonstrations)].
11. **Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts.** *Daniel Khashabi, Xinxi Lyu, Sewon Min, and et al.* NAACL 2022. [[pdf](https://aclanthology.org/2022.naacl-main.266.pdf)]; [[code](https://github.com/Alrope123/prompt-waywardness)].
12. **Do Prompt-Based Models Really Understand the Meaning of Their Prompts?** *Albert Webson and Ellie Pavlick.* NAACL 2022. [[pdf](https://aclanthology.org/2022.naacl-main.167.pdf)]; [[code](https://github.com/awebson/prompt_semantics)].
13. **Reframing Instructional Prompts to GPTk’s Language.** *Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, and Hannaneh Hajishirzi.* Findings of ACL 2022. [[pdf](https://aclanthology.org/2022.findings-acl.50.pdf)]; [[code](https://github.com/allenai/reframing/)].
14. **What Makes Good In-Context Examples for GPT-3?** *Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen.* ACL Workshop 2022. [[pdf](https://aclanthology.org/2022.deelio-1.10.pdf)]; [[code](https://github.com/jiachangliu/KATEGPT3)].
15. **Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.** *Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, and Pontus Stenetorp.* ACL 2022. [[pdf](https://aclanthology.org/2022.acl-long.556.pdf)].
16. **Calibrate Before Use: Improving Few-shot Performance of Language Models.** *Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, and Sameer Singh.* ICML 2021. [[pdf](https://arxiv.org/pdf/2102.09690.pdf)]; [[code](https://github.com/tonyzhaozh/few-shot-learning)].
### 5.3 Robustness and Safety
1. **Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection.** *Jun Yan, Vikas Yadav, Shiyang Li, and et al.* Workshop @ NeurIPS 2023. [[pdf](https://arxiv.org/abs/2307.16888)].
2. **Evaluating the Zero-shot Robustness of Instruction-tuned Language Models.** *Jiuding Sun, Chantal Shaib, and Byron C. Wallace.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.11270.pdf)].
3. **Poisoning Language Models During Instruction Tuning.** *Alexander Wan, Eric Wallace, Sheng Shen, and Dan Klein.* ICML 2023. [[pdf](https://arxiv.org/pdf/2305.00944.pdf)]; [[code](https://github.com/AlexWan0/Poisoning-Instruction-Tuned-Models)].
4. **Multi-step Jailbreaking Privacy Attacks on ChatGPT.** *Haoran Li, Dadi Guo, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng, and Yangqiu Song.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.05197.pdf)].
5. **More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models.** *Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.12173.pdf)]; [[code](https://github.com/greshake/llm-security)].
6. **Robustness of Learning from Task Instructions.** *Jiasheng Gu, Hanzi Xu, Liangyu Nie, and Wenpeng Yin.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.03813.pdf)].
7. **Learning from Task Descriptions.** *Orion Weller, Nicholas Lourie, Matt Gardner, and Matthew E. Peters.* EMNLP 2020. [[pdf](https://aclanthology.org/2020.emnlp-main.105.pdf)]; [[code](https://github.com/allenai/zest)]; [[corpus](https://allenai.org/data/zest)].
### 5.4 Evaluation
Stop using old-school automatic metrics to evaluate your instruction-tuned system; try more advanced methods to do it comprehensively!
1. **Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2.** *Hamish Ivison, Yizhong Wang, Valentina Pyatkin, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2311.10702.pdf)]; [[model&data](https://huggingface.co/collections/allenai/tulu-v2-suite-6551b56e743e6349aab45101)].
2. **How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources.** *Yizhong Wang, Hamish Ivison, Pradeep Dasigi, and et al.* NeurIPS Datasets and Benchmarks 2023. [[pdf](https://arxiv.org/pdf/2306.04751.pdf)]; [[code](https://github.com/allenai/open-instruct)].
3. **Instruction-following Evaluation through Verbalizer Manipulation.** *Shiyang Li, Jun Yan, Hai Wang, Zheng Tang, Xiang Ren, Vijay Srinivasan, and Hongxia Jin.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2307.10558.pdf)].
4. **INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models.** *Yew Ken Chia, Pengfei Hong, Lidong Bing, and Soujanya Poria.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.04757.pdf)]; [[code](https://github.com/declare-lab/instruct-eval)]; [[leaderboard](https://declare-lab.net/instruct-eval/)].
### 5.5 Negation
Negation expressions, such as `do not` and `avoid doing`, are difficult for models to correctly understand and follow.
1. **Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts.** *Joel Jang, Seonghyeon Ye, and Minjoon Seo.* ICML Workshop 2023. [[pdf](https://proceedings.mlr.press/v203/jang23a/jang23a.pdf)].
2. **Understanding by Understanding Not: Modeling Negation in Language Models.** *Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, and et al.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.102.pdf)]; [[code](https://github.com/arianhosseini/negation-learning)].
### 5.6 Complexity
These papers focus on increasing the complexity of instructions to improve model competence: the more complex data in the instruction mix, the more competent the resulting model tends to be.
1. **WizardLM: Empowering Large Language Models to Follow Complex Instructions.** *Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng, Chongyang Tao, and Daxin Jiang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.12244.pdf)]; [[code](https://github.com/nlpxucan/WizardLM)].
2. **Orca: Progressive Learning from Complex Explanation Traces of GPT-4.** *Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, and Ahmed Awadallah.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.02707.pdf)].
3. **A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment.** *Yingxiu Zhao, Bowen Yu, Binyuan Hui, Haiyang Yu, Fei Huang, Yongbin Li, and Nevin L. Zhang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2308.05696.pdf)]; [[code](https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/tree-instruct)].
### 5.7 Other Papers
1. **Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions.** *Mihir Parmar, Swaroop Mishra, Mor Geva, and Chitta Baral.* EACL 2023. [[pdf](https://arxiv.org/pdf/2205.00415.pdf)]; [[code](https://github.com/Mihir3009/instruction-bias)].
2. **Instruction Tuned Models are Quick Learners.** *Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2306.05539.pdf)]; [[code](https://github.com/srsawant34/efficient_instruction_learning)].
3. **Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning.** *Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin Raffel.* NeurIPS 2022. [[pdf](https://openreview.net/pdf?id=rBCvMG-JsPd)]; [[code](https://github.com/r-three/t-few)].
4. **A Survey of NLP-Related Crowdsourcing HITs: what works and what does not.** *Jessica Huynh, Jeffrey Bigham, and Maxine Eskenazi.* Preprint 2021. [[pdf](https://arxiv.org/pdf/2111.05241.pdf)].
## 6. 🤖 Applications
### 6.1 Human-Computer Interaction
Instructions are used in various human-computer interaction (HCI) tasks, such as virtual assistants and chatbots.
1. **Help me write a poem: Instruction Tuning as a Vehicle for Collaborative Poetry Writing.** *Tuhin Chakrabarty, Vishakh Padmakumar, and He He.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2210.13669.pdf)]; [[code](https://github.com/vishakhpk/creative-instructions)].
2. **HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models.** *Swaroop Mishra, and Elnaz Nouri.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2208.08232.pdf)].
3. **EditEval: An Instruction-Based Benchmark for Text Improvements.** *Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2209.13331.pdf)]; [[code](https://github.com/facebookresearch/EditEval)]; [[website](https://eval.ai/web/challenges/challenge-page/1866/overview)].
4. **Communicating Natural Programs to Humans and Machines.** *Sam Acquaviva, Yewen Pu, Marta Kryven, and et al.* NeurIPS Workshop 2022. [[pdf](https://openreview.net/pdf?id=OxFoLTKDcNm)]; [[code](https://github.com/samacqua/LARC)].
5. **Interactive Task Learning from GUI-Grounded Natural Language Instructions and Demonstrations.** *Toby Jia-Jun Li, Tom Mitchell, and Brad Myers.* ACL Demo 2020. [[pdf](https://aclanthology.org/2020.acl-demos.25.pdf)]; [[code](https://github.com/tobyli/Sugilite_development)]; [[video](https://www.youtube.com/watch?v=tdHEk-GeaqE)].
6. **Multi-Modal Interactive Task Learning from Demonstrations and Natural Language Instructions.** *Toby Jia-Jun Li.* UIST 2020. [[pdf](https://dl.acm.org/doi/pdf/10.1145/3379350.3415803)]; [[code](https://github.com/tobyli/Sugilite_development)].
7. **Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following.** *David Gaddy, and Dan Klein.* ACL 2019. [[pdf](https://aclanthology.org/P19-1188.pdf)].
8. **VirtualHome: Simulating Household Activities via Programs.** *Xavier Puig, Kevin Ra, Marko Boben, and et al.* CVPR 2018. [[pdf](https://openaccess.thecvf.com/content_cvpr_2018/papers/Puig_VirtualHome_Simulating_Household_CVPR_2018_paper.pdf)]; [[website](http://virtual-home.org/)].
9. **Natural Language Communication with Robots.** *Yonatan Bisk, Deniz Yuret, and Daniel Marcu.* NAACL 2016. [[pdf](https://aclanthology.org/N16-1089.pdf)]; [[website](https://groundedlanguage.github.io/)].
10. **Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World.** *Jayant Krishnamurthy, and Thomas Kollar.* TACL 2013. [[pdf](http://rtw.ml.cmu.edu/tacl2013_lsp/tacl2013-krishnamurthy-kollar.pdf)]; [[code](http://rtw.ml.cmu.edu/tacl2013_lsp/)].
11. **Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions.** *Yoav Artzi, and Luke Zettlemoyer.* TACL 2013. [[pdf](https://aclanthology.org/Q13-1005.pdf)].
12. **Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision.** *Joohyun Kim, and Raymond Mooney.* EMNLP 2012. [[pdf](https://aclanthology.org/D12-1040.pdf)].
13. **A joint model of language and perception for grounded attribute learning.** *Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, and Dieter Fox.* ICML 2012. [[pdf](https://arxiv.org/pdf/1206.6423.pdf)].
14. **Learning to Interpret Natural Language Instructions.** *Monica Babeş-Vroman, James MacGlashan, Ruoyuan Gao, and et al.* ACL Workshop 2012. [[pdf](https://aclanthology.org/W12-2801.pdf)].
15. **Fast Online Lexicon Learning for Grounded Language Acquisition.** *David Chen.* ACL 2012. [[pdf](https://aclanthology.org/P12-1045.pdf)].
16. **Learning to Win by Reading Manuals in a Monte-Carlo Framework.** *S.R.K. Branavan, David Silver, and Regina Barzilay.* ACL 2011. [[pdf](https://aclanthology.org/P11-1028.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/civ/)].
17. **Learning from natural instructions.** *Dan Goldwasser, and Dan Roth.* IJCAI 2011. [[pdf](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=2aba84801935041774c1e2b749e0331efa322ed8)].
18. **Learning to Interpret Natural Language Navigation Instructions from Observations.** *David L. Chen and Raymond J. Mooney.* AAAI 2011. [[pdf](https://www.cs.utexas.edu/users/ml/papers/chen.aaai11.pdf)].
19. **Approaching the Symbol Grounding Problem with Probabilistic Graphical Models.** *Stefanie Tellex, Thomas Kollar, Steven Dickerson, and et al.* AAAI 2011. [[pdf](https://cs.brown.edu/people/stellex/publications/tellex11a.pdf)].
20. **Driving Semantic Parsing from the World’s Response.** *James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth.* CoNLL 2010. [[pdf](https://aclanthology.org/W10-2903.pdf)].
21. **Learning to Follow Navigational Directions.** *Adam Vogel, and Daniel Jurafsky.* ACL 2010. [[pdf](https://aclanthology.org/P10-1083.pdf)].
22. **Reading between the Lines: Learning to Map High-Level Instructions to Commands.** *S.R.K. Branavan, Luke Zettlemoyer, and Regina Barzilay.* ACL 2010. [[pdf](https://aclanthology.org/P10-1129.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/rl-hli/)].
23. **Reading to Learn: Constructing Features from Semantic Abstracts.** *Jacob Eisenstein, James Clarke, Dan Goldwasser, and Dan Roth.* EMNLP 2009. [[pdf](https://aclanthology.org/D09-1100.pdf)]; [[website](http://www.comlab.ox.ac.uk/activities/machinelearning/Aleph/)].
24. **Learning Semantic Correspondences with Less Supervision.** *Percy Liang, Michael Jordan, and Dan Klein.* ACL 2009. [[pdf](https://aclanthology.org/P09-1011.pdf)].
25. **Reinforcement Learning for Mapping Instructions to Actions.** *S.R.K. Branavan, Harr Chen, Luke Zettlemoyer, and Regina Barzilay.* ACL 2009. [[pdf](https://aclanthology.org/P09-1010.pdf)]; [[website](http://groups.csail.mit.edu/rbg/code/rl/)].
26. **Learning to sportscast: a test of grounded language acquisition.** *David L. Chen and Raymond J. Mooney.* ICML 2008. [[pdf](https://dl.acm.org/doi/pdf/10.1145/1390156.1390173)].
27. **Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer.** *Gregory Kuhlmann, Peter Stone, Raymond Mooney, and Jude Shavlik.* AAAI Workshop 2004. [[pdf](https://ftp.cs.wisc.edu/machine-learning/shavlik-group/kuhlmann-aaai04.pdf)]; [[website](http://www.cs.utexas.edu/AustinVilla/sim/keepaway/)].
### 6.2 Data and Feature Augmentation
Some instructions (e.g., label explanations) are also used for automatic annotation (i.e., data augmentation) or for enriching features.
1. **One Embedder, Any Task: Instruction-Finetuned Text Embeddings.** *Hongjin Su, Weijia Shi, Jungo Kasai, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.09741.pdf)]; [[website](https://instructor-embedding.github.io/)].
2. **Prompt Consistency for Zero-Shot Task Generalization.** *Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, and Graham Neubig.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2205.00049.pdf)]; [[code](https://github.com/violet-zct/swarm-distillation-zero-shot)].
3. **Teaching Machine Comprehension with Compositional Explanations.** *Qinyuan Ye, Xiao Huang, Elizabeth Boschee, and Xiang Ren.* Findings of EMNLP 2020. [[pdf](https://aclanthology.org/2020.findings-emnlp.145.pdf)]; [[code](https://github.com/INK-USC/mrc-explanation)].
4. **Learning from Explanations with Neural Execution Tree.** *Ziqi Wang, Yujia Qin, Wenxuan Zhou, Jun Yan, Qinyuan Ye, Leonardo Neves, Zhiyuan Liu, and Xiang Ren.* ICLR 2020. [[pdf](https://openreview.net/pdf?id=rJlUt0EYwS)]; [[website](http://inklab.usc.edu/project-NExT/)].
5. **Training Classifiers with Natural Language Explanations.** *Braden Hancock, Paroma Varma, Stephanie Wang, Martin Bringmann, Percy Liang, and Christopher Ré.* ACL 2018. [[pdf](https://aclanthology.org/P18-1175.pdf)]; [[code](https://github.com/HazyResearch/babble)].
6. **Zero-shot Learning of Classifiers from Natural Language Quantification.** *Shashank Srivastava, Igor Labutov, and Tom Mitchell.* ACL 2018. [[pdf](https://aclanthology.org/P18-1029.pdf)].
7. **Joint Concept Learning and Semantic Parsing from Natural Language Explanations.** *Shashank Srivastava, Igor Labutov, and Tom Mitchell.* EMNLP 2017. [[pdf](https://aclanthology.org/D17-1161.pdf)].
### 6.3 General-purpose Language Models
General-purpose language models, e.g., [ChatGPT](https://chat.openai.com/chat), are among the most attractive applications of instruction learning, since instruction tuning can help align them with human values.
1. **Sparks of Artificial General Intelligence: Early experiments with GPT-4.** *Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2303.12712.pdf)].
2. **GPT-4 Technical Report.** *OpenAI.* Preprint 2023. [[pdf](https://cdn.openai.com/papers/gpt-4.pdf)]; [[blog](https://openai.com/research/gpt-4)].
3. **The Wisdom of Hindsight Makes Language Models Better Instruction Followers.** *Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, and Joseph E. Gonzalez.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.05206.pdf)]; [[code](https://github.com/tianjunz/HIR)].
4. **Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models.** *Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, and Bryan Catanzaro.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.07388.pdf)].
5. **Training language models to follow instructions with human feedback.** *Long Ouyang, Jeffrey Wu, Xu Jiang, and et al.* NeurIPS 2022. [[pdf](https://openreview.net/pdf?id=TG8KACxEON)].
### 6.4 Other Papers
1. **GPTScore: Evaluate as You Desire.** *Jinlan Fu, See-Kiong Ng, Zhengbao Jiang, and Pengfei Liu.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.04166.pdf)]; [[code](https://github.com/jinlanfu/GPTScore)].
2. **MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning.** *Zhiyang Xu, Ying Shen, and Lifu Huang.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.10773.pdf)].
3. **Task-aware Retrieval with Instructions.** *Akari Asai, Timo Schick, Patrick Lewis, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.09260.pdf)]; [[code](https://github.com/facebookresearch/tart)].
4. **UnifiedABSA: A Unified ABSA Framework Based on Multi-task Instruction Tuning.** *Zengzhi Wang, Rui Xia, and Jianfei Yu.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.10986.pdf)].
5. **In-Context Learning for Few-Shot Dialogue State Tracking.** *Yushi Hu, Chia-Hsuan Lee, Tianbao Xie, Tao Yu, Noah A. Smith, and Mari Ostendorf.* Findings of EMNLP 2022. [[pdf](https://arxiv.org/pdf/2203.08568.pdf)]; [[code](https://github.com/Yushi-Hu/IC-DST)].
6. **Few-shot Learning with Multilingual Language Models.** *Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2112.10668.pdf)]; [[code](https://github.com/facebookresearch/fairseq/tree/main/examples/xglm)].
7. **UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models.** *Tianbao Xie, Chen Henry Wu, Peng Shi, and et al.* EMNLP 2022. [[pdf](https://arxiv.org/pdf/2201.05966.pdf)]; [[code](https://github.com/HKUNLP/UnifiedSKG)]; [[website](https://unifiedskg.com/)].
8. **In-BoXBART: Get Instructions into Biomedical Multi-Task Learning.** *Mihir Parmar, Swaroop Mishra, Mirali Purohit, Man Luo, M. Hassan Murad, and Chitta Baral.* Findings of NAACL 2022. [[pdf](https://arxiv.org/pdf/2204.07600.pdf)]; [[code](https://github.com/Mihir3009/In-BoXBART)].
## 7. 📖 Extended Reading
We also share some other awesome papers that might inspire future work.
### 7.1 Instruction Induction
1. **Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners.** *Seonghyeon Ye, Doyoung Kim, Joel Jang, Joongbo Shin, and Minjoon Seo.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2210.02969.pdf)]; [[code](https://github.com/seonghyeonye/Flipped-Learning)].
2. **Instruction Induction: From Few Examples to Natural Language Task Descriptions.** *Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2205.10782.pdf)]; [[code](https://github.com/orhonovich/instruction-induction)].
3. **Learning to Decompose and Organize Complex Tasks.** *Yi Zhang, Sujay Kumar Jauhar, Julia Kiseleva, Ryen White, and Dan Roth.* NAACL 2021. [[pdf](https://aclanthology.org/2021.naacl-main.217.pdf)]; [[corpus](https://github.com/microsoft/MSComplexTasks)].
4. **Analogous Process Structure Induction for Sub-event Sequence Prediction.** *Hongming Zhang, Muhao Chen, Haoyu Wang, Yangqiu Song, and Dan Roth.* EMNLP 2020. [[pdf](https://aclanthology.org/2020.emnlp-main.119.pdf)]; [[code](https://cogcomp.github.io/APSI/)].
### 7.2 ChatGPT-related Papers
Nowadays, ChatGPT is a superstar 🌟 in the NLP community. Since there is no official paper for ChatGPT, we share some frontier works that can provide deep insights into it.
1. **When do you need Chain-of-Thought Prompting for ChatGPT?** *Jiuhai Chen, Lichang Chen, Heng Huang, and Tianyi Zhou.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.03262.pdf)].
2. **Toxicity in ChatGPT: Analyzing Persona-assigned Language Models.** *Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, and Karthik Narasimhan.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2304.05335.pdf)].
3. **Is ChatGPT a General-Purpose Natural Language Processing Task Solver?** *Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga, and Diyi Yang.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.06476.pdf)].
4. **How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection.** *Biyang Guo, Xin Zhang, Ziyuan Wang, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2301.07597.pdf)]; [[corpus](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection)].
5. **ChatGPT: Jack of all trades, master of none.** *Jan Kocoń, Igor Cichecki, Oliwier Kaszyca, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.10724.pdf)].
6. **On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective.** *Jindong Wang, Xixu Hu, Wenxin Hou, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.12095.pdf)]; [[code](https://github.com/microsoft/robustlearn)].
### 7.3 Human Feedback vs. Model Feedback
1. **Aligning Large Language Models through Synthetic Feedback.** *Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, and Minjoon Seo.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.13735.pdf)].
2. **LIMA: Less Is More for Alignment.** *Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.11206.pdf)].
3. **Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision.** *Zhiqing Sun, Yikang Shen, Qinhong Zhou, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2305.03047.pdf)]; [[code](https://github.com/IBM/Dromedary)].
4. **Chain of Hindsight Aligns Language Models with Feedback.** *Hao Liu, Carmelo Sferrazza, and Pieter Abbeel.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.02676.pdf)]; [[code](https://github.com/lhao499/CoH)].
5. **Pretraining Language Models with Human Preferences.** *Tomasz Korbak, Kejian Shi, Angelica Chen, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.08582.pdf)].
6. **Constitutional AI: Harmlessness from AI Feedback.** *Yuntao Bai, Saurav Kadavath, Sandipan Kundu, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2212.08073.pdf)]; [[corpus](https://github.com/anthropics/ConstitutionalHarmlessnessPaper)].
7. **Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback.** *Yuntao Bai, Andy Jones, Kamal Ndousse, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2204.05862.pdf)]; [[corpus](https://github.com/anthropics/hh-rlhf)].
### 7.4 Scalable Oversight and Alignment
1. **Measuring Progress on Scalable Oversight for Large Language Models.** *Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2211.03540.pdf)].
2. **Aligning AI With Shared Human Values.** *Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, and Jacob Steinhardt.* ICLR 2021. [[pdf](https://openreview.net/pdf?id=dNy_RKzJacY)].
### 7.5 Other Papers
1. **Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models.** *Kaitlyn Zhou, Dan Jurafsky, and Tatsunori Hashimoto.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.13439.pdf)].
2. **The Capacity for Moral Self-Correction in Large Language Models.** *Deep Ganguli, Amanda Askell, Nicholas Schiefer, and et al.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.07459.pdf)].
3. **Large Language Models Can Be Easily Distracted by Irrelevant Context.** *Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed Chi, Nathanael Schärli, and Denny Zhou.* Preprint 2023. [[pdf](https://arxiv.org/pdf/2302.00093.pdf)]; [[corpus](https://github.com/google-research-datasets/GSM-IC)].
4. **Language Models (Mostly) Know What They Know.** *Saurav Kadavath, Tom Conerly, Amanda Askell, and et al.* Preprint 2022. [[pdf](https://arxiv.org/pdf/2207.05221.pdf)].
---
## ⭐ Star History
[![Star History Chart](https://api.star-history.com/svg?repos=RenzeLou/awesome-instruction-learning&type=Date)](https://star-history.com/#RenzeLou/awesome-instruction-learning&Date)