awesome-finetuning
A curated list of resources on fine-tuning language models.
https://github.com/mmarius/awesome-finetuning
Fine-tuning before transformers
- Semi-supervised Sequence Learning
- How Transferable are Neural Networks in NLP Applications?
- Improving Neural Machine Translation Models with Monolingual Data
- Question Answering through Transfer Learning from Large Fine-grained Supervision Data
- Universal Language Model Fine-tuning for Text Classification
- An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
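Universal Language Model Fine-tuning (ULMFiT) introduced two of the tricks that defined this pre-transformer era: discriminative learning rates and gradual unfreezing. Below is a minimal sketch of both in plain PyTorch; the toy LSTM classifier, layer grouping, and learning-rate schedule are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: ULMFiT-style discriminative learning rates and gradual unfreezing.
# The model, layer groups, and learning rates below are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=400, hidden=1150, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=3, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        h, _ = self.lstm(self.embedding(x))
        return self.head(h[:, -1])  # classify from the last hidden state

model = LSTMClassifier()

# Layer groups, ordered from lowest (embeddings) to highest (task head).
groups = [model.embedding, model.lstm, model.head]

# Discriminative learning rates: higher, more task-specific groups get larger LRs.
base_lr = 1e-3
optimizer = torch.optim.Adam(
    [{"params": g.parameters(), "lr": base_lr / (2.6 ** (len(groups) - 1 - i))}
     for i, g in enumerate(groups)]
)

# Gradual unfreezing: start with only the head trainable, then unfreeze one
# additional group per epoch, training as usual in between (loop body omitted).
for p in model.parameters():
    p.requires_grad = False
for epoch, unfrozen in enumerate(range(len(groups) - 1, -1, -1)):
    for g in groups[unfrozen:]:
        for p in g.parameters():
            p.requires_grad = True
    # ... run one epoch of fine-tuning here ...
```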
Fine-tuning transformers
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Better Fine-Tuning by Reducing Representational Collapse
- FreeLB: Enhanced Adversarial Training for Natural Language Understanding
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
- Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
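Most of the papers above build on the standard BERT recipe: attach a task-specific head to a pretrained encoder and update all weights with a small learning rate. A minimal sketch with Hugging Face Transformers and PyTorch; the checkpoint name, toy data, and hyperparameters are placeholders.

```python
# Sketch: standard full fine-tuning of a pretrained transformer for classification.
# Checkpoint, toy data, and hyperparameters are placeholder assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # small LR is typical for fine-tuning

texts = ["a great movie", "a terrible movie"]  # toy examples, stand-ins for a real dataset
labels = torch.tensor([1, 0])

model.train()
for step in range(3):  # a real run would iterate over a DataLoader for several epochs
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=labels)  # every parameter receives gradients
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```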
Intermediate task fine-tuning
- Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
- Transfer Fine-Tuning: A BERT Case Study
- Learning and Evaluating General Linguistic Intelligence
- Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?
- English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too
- What to Pre-Train on? Efficient Intermediate Task Selection
- Is Supervised Syntactic Parsing Beneficial for Language Understanding Tasks? An Empirical Investigation
- Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling
- Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
- Mining Knowledge for Natural Language Inference from Wikipedia Categories
- Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank
- Train No Evil: Selective Masking for Task-Guided Pre-Training
- Injecting Numerical Reasoning Skills into Language Models
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
- Analyzing Commonsense Emergence in Few-shot Knowledge Models
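The STILTs line of work above inserts a data-rich intermediate task (often MNLI) between pretraining and target-task fine-tuning. A minimal two-stage sketch of that recipe; the checkpoint names, label counts, and save path are illustrative assumptions.

```python
# Sketch: intermediate-task (STILTs-style) fine-tuning in two stages.
# Checkpoint names, label counts, and the save path are illustrative.
from transformers import AutoModelForSequenceClassification

# Stage 1: supplementary training on the intermediate task (3-way NLI labels).
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)
# ... fine-tune on MNLI here, then save the adapted encoder ...
model.save_pretrained("bert-mnli-intermediate")

# Stage 2: reuse the adapted encoder, re-initialize only the classification head
# for the target task, and continue fine-tuning from the intermediate checkpoint.
target_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-mnli-intermediate",
    num_labels=2,
    ignore_mismatched_sizes=True,  # drops the 3-way NLI head and creates a fresh 2-way head
)
# ... fine-tune target_model on the (typically smaller) target task ...
```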
Parameter-efficient fine-tuning
- Parameter-Efficient Transfer Learning for NLP
- BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning
- Simple, Scalable Adaptation for Neural Machine Translation
- Masking as an Efficient Alternative to Finetuning for Pretrained Language Models
- Movement Pruning: Adaptive Sparsity by Fine-Tuning
- AdapterFusion: Non-Destructive Task Composition for Transfer Learning
- MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
- AdapterDrop: On the Efficiency of Adapters in Transformers
- Parameter-Efficient Transfer Learning with Diff Pruning
- Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
- LoRA: Low-Rank Adaptation of Large Language Models
- BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
- Training Neural Networks with Fixed Sparse Masks
- Towards a Unified View of Parameter-Efficient Transfer Learning
- Composable Sparse Fine-Tuning for Cross-Lingual Transfer
- Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
- Prompt-free and Efficient Few-shot Learning with Language Models
- Adaptable Adapters
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
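The methods in this section keep the pretrained weights frozen and train only a small number of extra or selected parameters. As one representative example, here is a minimal sketch of the LoRA idea in plain PyTorch: freeze a pretrained linear layer and learn a low-rank update alongside it. The rank, scaling, and layer shapes are illustrative, not the paper's exact settings.

```python
# Sketch: the LoRA idea on a single linear layer. The frozen base weight W is
# kept fixed; only the small low-rank matrices A and B are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        # base output plus the scaled low-rank update x A^T B^T
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(768, 768))  # e.g., wrapping an attention projection
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} / {total}")  # only the rank-8 adapter is updated
```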
Prompt-based fine-tuning
- Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
- It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
- Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification
- Few-Shot Text Generation with Natural Language Instructions
- Making Pre-trained Language Models Better Few-shot Learners
- AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
- How Many Data Points is a Prompt Worth?
- Improving and Simplifying Pattern Exploiting Training
- Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
- Calibrate Before Use: Improving Few-Shot Performance of Language Models
- PTR: Prompt Tuning with Rules for Text Classification
- Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
- Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification
- Prompt-Learning for Fine-Grained Entity Typing
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
- Prototypical Verbalizer for Prompt-based Few-shot Tuning
- Cross-Task Generalization via Natural Language Crowdsourcing Instructions
- Discrete and Soft Prompting for Multilingual Models
- Finetuned Language Models Are Zero-Shot Learners
- Multitask Prompted Training Enables Zero-Shot Task Generalization
- Prompt Consistency for Zero-Shot Task Generalization
- Few-shot Adaptation Works with UnpredicTable Data
- Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks
- Prefix-Tuning: Optimizing Continuous Prompts for Generation
- WARP: Word-level Adversarial ReProgramming
- Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
- Factual Probing Is [MASK]: Learning vs. Learning to Recall
- The Power of Scale for Parameter-Efficient Prompt Tuning
- Multimodal Few-Shot Learning with Frozen Language Models
- Noisy Channel Language Model Prompting for Few-Shot Text Classification
- Continuous Entailment Patterns for Lexical Inference in Context
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
- SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
- P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks
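Pattern-exploiting approaches such as PET recast classification as a cloze task: the input is wrapped in a natural-language pattern and a verbalizer maps label words to classes. A minimal sketch with a masked LM, shown zero-shot for brevity (PET additionally fine-tunes on such patterns); the pattern, verbalizer words, and checkpoint are illustrative assumptions.

```python
# Sketch: cloze-style (PET-like) prompting with a masked LM and a verbalizer.
# Pattern, label words, and checkpoint are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "A thoroughly enjoyable film."
pattern = f"{text} It was {tokenizer.mask_token}."           # cloze pattern wrapping the input
verbalizer = {"positive": "great", "negative": "terrible"}   # label word for each class

inputs = tokenizer(pattern, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]             # vocabulary scores at the mask slot

scores = {
    label: logits[0, tokenizer.convert_tokens_to_ids(word)].item()
    for label, word in verbalizer.items()
}
print(max(scores, key=scores.get))  # pick the class whose label word scores highest
```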
Evaluating few-shot fine-tuning
Fine-tuning analysis
- Visualizing and Understanding the Effectiveness of BERT
- oLMpics - On What Language Model Pre-training Captures
- Pretrained Transformers Improve Out-of-Distribution Robustness
- What Happens To BERT Embeddings During Fine-tuning?
- Investigating Learning Dynamics of BERT Fine-Tuning
- Investigating Transferability in Pretrained Language Models
- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
- Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
- Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
- When Do You Need Billions of Words of Pretraining Data?
- On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation
- Pretrained Transformers as Universal Computation Engines
- Predicting Inductive Biases of Pre-Trained Models
- Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
- On the Importance of Data Size in Probing Fine-tuned Models
- BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
- Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics
- An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
- Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)
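Several of the analysis papers above ask how much fine-tuning actually changes the pretrained representations. A minimal sketch that compares per-layer [CLS] vectors of a pretrained and a fine-tuned encoder via cosine similarity; `your-finetuned-checkpoint` is a placeholder path for any fine-tuned encoder you have saved locally.

```python
# Sketch: probe how fine-tuning changes representations by comparing per-layer
# [CLS] vectors before and after fine-tuning. The fine-tuned path is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
pretrained = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
finetuned = AutoModel.from_pretrained(
    "your-finetuned-checkpoint",  # placeholder: any encoder saved with save_pretrained
    output_hidden_states=True,
)

inputs = tokenizer("An example sentence for probing.", return_tensors="pt")
with torch.no_grad():
    h_pre = pretrained(**inputs).hidden_states  # tuple: embedding output + one tensor per layer
    h_ft = finetuned(**inputs).hidden_states

for layer, (a, b) in enumerate(zip(h_pre, h_ft)):
    sim = torch.cosine_similarity(a[:, 0], b[:, 0]).item()  # compare the [CLS] position
    print(f"layer {layer:2d}: cos = {sim:.3f}")
```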
Theoretical work
Surveys
Misc.
Disclaimer