awesome-finetuning
A curated list of resources on fine-tuning language models.
https://github.com/mmarius/awesome-finetuning
Fine-tuning before transformers
- Semi-supervised Sequence Learning
- How Transferable are Neural Networks in NLP Applications?
- Improving Neural Machine Translation Models with Monolingual Data
- Question Answering through Transfer Learning from Large Fine-grained Supervision Data
- Universal Language Model Fine-tuning for Text Classification
- An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
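Universal Language Model Fine-tuning (ULMFiT) introduced two of the tricks that defined this pre-transformer era: discriminative learning rates and gradual unfreezing. Below is a minimal sketch of both in plain PyTorch; the toy LSTM classifier, layer grouping, and learning-rate schedule are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: ULMFiT-style discriminative learning rates and gradual unfreezing.
# The model, layer groups, and learning rates below are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=400, hidden=1150, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=3, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        h, _ = self.lstm(self.embedding(x))
        return self.head(h[:, -1])  # classify from the last hidden state

model = LSTMClassifier()

# Layer groups, ordered from lowest (embeddings) to highest (task head).
groups = [model.embedding, model.lstm, model.head]

# Discriminative learning rates: higher, more task-specific groups get larger LRs.
base_lr = 1e-3
optimizer = torch.optim.Adam(
    [{"params": g.parameters(), "lr": base_lr / (2.6 ** (len(groups) - 1 - i))}
     for i, g in enumerate(groups)]
)

# Gradual unfreezing: start with only the head trainable, then unfreeze one
# additional group per epoch, training as usual in between (loop body omitted).
for p in model.parameters():
    p.requires_grad = False
for epoch, unfrozen in enumerate(range(len(groups) - 1, -1, -1)):
    for g in groups[unfrozen:]:
        for p in g.parameters():
            p.requires_grad = True
    # ... run one epoch of fine-tuning here ...
```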
Fine-tuning transformers
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Better Fine-Tuning by Reducing Representational Collapse
- FreeLB: Enhanced Adversarial Training for Natural Language Understanding
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
- Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
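Most of the papers above build on the standard BERT recipe: attach a task-specific head to a pretrained encoder and update all weights with a small learning rate. A minimal sketch with Hugging Face Transformers and PyTorch; the checkpoint name, toy data, and hyperparameters are placeholders.

```python
# Sketch: standard full fine-tuning of a pretrained transformer for classification.
# Checkpoint, toy data, and hyperparameters are placeholder assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # small LR is typical for fine-tuning

texts = ["a great movie", "a terrible movie"]  # toy examples, stand-ins for a real dataset
labels = torch.tensor([1, 0])

model.train()
for step in range(3):  # a real run would iterate over a DataLoader for several epochs
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=labels)  # every parameter receives gradients
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```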
Intermediate task fine-tuning
- Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks
- Transfer Fine-Tuning: A BERT Case Study
- Learning and Evaluating General Linguistic Intelligence
- Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?
- English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too
- What to Pre-Train on? Efficient Intermediate Task Selection
- Is Supervised Syntactic Parsing Beneficial for Language Understanding Tasks? An Empirical Investigation
- Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling
- Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks
- Mining Knowledge for Natural Language Inference from Wikipedia Categories
- Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank
- Train No Evil: Selective Masking for Task-Guided Pre-Training
- Injecting Numerical Reasoning Skills into Language Models
- Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
- Analyzing Commonsense Emergence in Few-shot Knowledge Models
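The STILTs line of work above inserts a data-rich intermediate task (often MNLI) between pretraining and target-task fine-tuning. A minimal two-stage sketch of that recipe; the checkpoint names, label counts, and save path are illustrative assumptions.

```python
# Sketch: intermediate-task (STILTs-style) fine-tuning in two stages.
# Checkpoint names, label counts, and the save path are illustrative.
from transformers import AutoModelForSequenceClassification

# Stage 1: supplementary training on the intermediate task (3-way NLI labels).
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)
# ... fine-tune on MNLI here, then save the adapted encoder ...
model.save_pretrained("bert-mnli-intermediate")

# Stage 2: reuse the adapted encoder, re-initialize only the classification head
# for the target task, and continue fine-tuning from the intermediate checkpoint.
target_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-mnli-intermediate",
    num_labels=2,
    ignore_mismatched_sizes=True,  # drops the 3-way NLI head and creates a fresh 2-way head
)
# ... fine-tune target_model on the (typically smaller) target task ...
```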
Parameter-efficient fine-tuning
- Parameter-Efficient Transfer Learning for NLP
- BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning
- Simple, Scalable Adaptation for Neural Machine Translation
- Masking as an Efficient Alternative to Finetuning for Pretrained Language Models
- Movement Pruning: Adaptive Sparsity by Fine-Tuning
- AdapterFusion: Non-Destructive Task Composition for Transfer Learning
- MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
- AdapterDrop: On the Efficiency of Adapters in Transformers
- Parameter-Efficient Transfer Learning with Diff Pruning
- Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
- LoRA: Low-Rank Adaptation of Large Language Models
- BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
- Training Neural Networks with Fixed Sparse Masks
- Towards a Unified View of Parameter-Efficient Transfer Learning
- Composable Sparse Fine-Tuning for Cross-Lingual Transfer
- Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
- Prompt-free and Efficient Few-shot Learning with Language Models
- Adaptable Adapters
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
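The methods in this section keep the pretrained weights frozen and train only a small number of extra or selected parameters. As one representative example, here is a minimal sketch of the LoRA idea in plain PyTorch: freeze a pretrained linear layer and learn a low-rank update alongside it. The rank, scaling, and layer shapes are illustrative, not the paper's exact settings.

```python
# Sketch: the LoRA idea on a single linear layer. The frozen base weight W is
# kept fixed; only the small low-rank matrices A and B are trained.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        # base output plus the scaled low-rank update x A^T B^T
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(768, 768))  # e.g., wrapping an attention projection
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable parameters: {trainable} / {total}")  # only the rank-8 adapter is updated
```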
Prompt-based fine-tuning
- Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
- It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
- Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification
- Few-Shot Text Generation with Natural Language Instructions
- Making Pre-trained Language Models Better Few-shot Learners
- AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
- How Many Data Points is a Prompt Worth?
- Improving and Simplifying Pattern Exploiting Training
- Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
- Calibrate Before Use: Improving Few-Shot Performance of Language Models
- PTR: Prompt Tuning with Rules for Text Classification
- Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
- Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification
- Prompt-Learning for Fine-Grained Entity Typing
- Do Prompt-Based Models Really Understand the Meaning of their Prompts?
- Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
- Prototypical Verbalizer for Prompt-based Few-shot Tuning
- Cross-Task Generalization via Natural Language Crowdsourcing Instructions
- Discrete and Soft Prompting for Multilingual Models
- Finetuned Language Models Are Zero-Shot Learners
- Multitask Prompted Training Enables Zero-Shot Task Generalization
- Prompt Consistency for Zero-Shot Task Generalization
- Few-shot Adaptation Works with UnpredicTable Data
- Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks
- Prefix-Tuning: Optimizing Continuous Prompts for Generation
- WARP: Word-level Adversarial ReProgramming
- Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
- Factual Probing Is [MASK]: Learning vs. Learning to Recall
- The Power of Scale for Parameter-Efficient Prompt Tuning
- Multimodal Few-Shot Learning with Frozen Language Models
- Noisy Channel Language Model Prompting for Few-Shot Text Classification
- Continuous Entailment Patterns for Lexical Inference in Context
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
- SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
- P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks
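Pattern-exploiting approaches such as PET recast classification as a cloze task: the input is wrapped in a natural-language pattern and a verbalizer maps label words to classes. A minimal sketch with a masked LM, shown zero-shot for brevity (PET additionally fine-tunes on such patterns); the pattern, verbalizer words, and checkpoint are illustrative assumptions.

```python
# Sketch: cloze-style (PET-like) prompting with a masked LM and a verbalizer.
# Pattern, label words, and checkpoint are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "A thoroughly enjoyable film."
pattern = f"{text} It was {tokenizer.mask_token}."           # cloze pattern wrapping the input
verbalizer = {"positive": "great", "negative": "terrible"}   # label word for each class

inputs = tokenizer(pattern, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]             # vocabulary scores at the mask slot

scores = {
    label: logits[0, tokenizer.convert_tokens_to_ids(word)].item()
    for label, word in verbalizer.items()
}
print(max(scores, key=scores.get))  # pick the class whose label word scores highest
```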
Evaluating few-shot fine-tuning
Fine-tuning analysis
- Visualizing and Understanding the Effectiveness of BERT
- oLMpics - On What Language Model Pre-training Captures
- Pretrained Transformers Improve Out-of-Distribution Robustness
- What Happens To BERT Embeddings During Fine-tuning?
- Investigating Learning Dynamics of BERT Fine-Tuning
- Investigating Transferability in Pretrained Language Models
- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
- Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers
- Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
- When Do You Need Billions of Words of Pretraining Data?
- On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation
- Pretrained Transformers as Universal Computation Engines
- Predicting Inductive Biases of Pre-Trained Models
- Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
- On the Importance of Data Size in Probing Fine-tuned Models
- BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
- Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics
- An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
- Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)
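Several of the analysis papers above ask how much fine-tuning actually changes the pretrained representations. A minimal sketch that compares per-layer [CLS] vectors of a pretrained and a fine-tuned encoder via cosine similarity; `your-finetuned-checkpoint` is a placeholder path for any fine-tuned encoder you have saved locally.

```python
# Sketch: probe how fine-tuning changes representations by comparing per-layer
# [CLS] vectors before and after fine-tuning. The fine-tuned path is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
pretrained = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
finetuned = AutoModel.from_pretrained(
    "your-finetuned-checkpoint",  # placeholder: any encoder saved with save_pretrained
    output_hidden_states=True,
)

inputs = tokenizer("An example sentence for probing.", return_tensors="pt")
with torch.no_grad():
    h_pre = pretrained(**inputs).hidden_states  # tuple: embedding output + one tensor per layer
    h_ft = finetuned(**inputs).hidden_states

for layer, (a, b) in enumerate(zip(h_pre, h_ft)):
    sim = torch.cosine_similarity(a[:, 0], b[:, 0]).item()  # compare the [CLS] position
    print(f"layer {layer:2d}: cos = {sim:.3f}")
```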
Theoretical work
Surveys
Misc.
Disclaimer