Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Awesome-instruction-tuning
A curated list of awesome instruction tuning datasets, models, papers and repositories.
https://github.com/zhilizju/Awesome-instruction-tuning
Last synced: about 5 hours ago
Datasets and Models
Modified from Traditional NLP

Adapted from Longpre et al.

| Dataset | Model | Base model | Model size |
|---|---|---|---|
| Natural Inst v1.0 | | | |
| Flan 2021 | Flan-LaMDA | LaMDA | 137B |
| Flan 2022 | Flan-T5, Flan-PaLM | T5-LM, PaLM | 10M-540B |
| xP3 | | | 176B |
| UnifiedQA | | | 340M |
| CrossFit | CrossFit | BART | 140M |
| P3 | | T5-LM | 3-11B |
| MetaICL | MetaICL | GPT-2 | 770M |
| ExMix | | | 11B |
| Super-Natural Inst. | Tk-Instruct | T5-LM, mT5 | 17M-13B |
| GLM | GLM-130B | GLM | 130B |
| Unnatural Inst. | T5-LM-Unnat. Inst. | T5-LM | 11B |
Generated by LLMs
| Project | Base model | Model size | Dataset | Data size | Lang |
|---|---|---|---|---|---|
| guanaco | | | | | |
| alpaca | | | [alpaca_data](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json) | 52k | En |
| Chinese-Vicuna | | | | | |
| Alpaca-CoT | | | Alpaca-CoT collection (see its statistics section) | | En, Zh |
| dolly | | | [alpaca_data](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json) | 52k | En |
| ColossalChat | | | | | |
| Luotuo | | | [trans_chinese_alpaca_data](https://github.com/tloen/alpaca-lora/blob/main/data/trans_chinese_alpaca_data.json) | 52k | Zh |
| cerebras-lora-alpaca | Cerebras-GPT | 2.7B | [AlpacaDataCleaned](https://github.com/gururise/AlpacaDataCleaned) | 52k | En |
| alpaca-lora | | | [alpaca_data](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json), [alpaca_data_cleaned](https://github.com/tloen/alpaca-lora/blob/main/alpaca_data_cleaned.json) | 52k | En |
| Chinese-LLaMA-Alpaca | | | Chinese-LLaMA-Alpaca data, [pCLUE](https://github.com/CLUEbenchmark/pCLUE), [translation2019zh](https://github.com/brightmart/nlp_chinese_corpus#5%E7%BF%BB%E8%AF%91%E8%AF%AD%E6%96%99translation2019zh), [alpaca_data](https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json), Self-Instruct | 2M | Zh |
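Most of the LLM-generated datasets above (alpaca_data, its cleaned variants, Luotuo's translation) share the Alpaca record schema: a JSON array of objects with `instruction`, `input`, and `output` fields. A minimal sketch of parsing such a file and rendering records into training prompts — the template wording follows the one popularized by Stanford Alpaca, but the sample records here are invented for illustration:

```python
import json

# Alpaca-style format: a JSON array of {"instruction", "input", "output"}
# dicts, as in alpaca_data.json. These two records are made-up examples.
sample = json.loads("""[
  {"instruction": "Translate the text to French.", "input": "Hello", "output": "Bonjour"},
  {"instruction": "Name a prime number.", "input": "", "output": "7"}
]""")

def to_prompt(rec: dict) -> str:
    """Render one record into a single supervised fine-tuning example.

    Records with a non-empty "input" use the instruction+input template;
    the rest use the instruction-only template.
    """
    if rec["input"]:
        return ("Below is an instruction that describes a task, paired with an "
                "input that provides further context. Write a response that "
                "appropriately completes the request.\n\n"
                f"### Instruction:\n{rec['instruction']}\n\n"
                f"### Input:\n{rec['input']}\n\n"
                f"### Response:\n{rec['output']}")
    return ("Below is an instruction that describes a task. Write a response "
            "that appropriately completes the request.\n\n"
            f"### Instruction:\n{rec['instruction']}\n\n"
            f"### Response:\n{rec['output']}")

prompts = [to_prompt(r) for r in sample]
```

In practice the same function works on the full 52k-record files linked above, since they use the same three fields.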
Papers
- **Finetuned language models are zero-shot learners**
- **Multitask Prompted Training Enables Zero-Shot Task Generalization**
- **Training language models to follow instructions with human feedback**
- **Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks**
- **Unsupervised Cross-Task Generalization via Retrieval Augmentation**
- **Instruction Induction: From Few Examples to Natural Language Task Descriptions**
- **Scaling Instruction-Finetuned Language Models**
- **Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners**
- **Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor**
- **Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations**
- **Self-Instruct: Aligning Language Models with Self-Generated Instructions**
- **MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning**
- **The Flan Collection: Designing Data and Methods for Effective Instruction Tuning**
- **In-Context Instruction Learning**
Repositories

- Instruction
- ICL
- Reason
- Framework