https://github.com/gaussalgo/l2l_mlprague23
Materials for the Learning to Learn workshop at Machine Learning Prague 2023.
- Host: GitHub
- URL: https://github.com/gaussalgo/l2l_mlprague23
- Owner: gaussalgo
- License: MIT
- Created: 2023-05-12T08:07:41.000Z
- Default Branch: main
- Last Pushed: 2023-06-08T12:46:33.000Z
- Language: Jupyter Notebook
- Size: 5.04 MB
- Stars: 6
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Learning to Learn workshop at MLPrague 2023
This repo contains links to the materials for the Learning to Learn workshop at Machine Learning Prague 2023.

## Getting started
The primary playground environment for the exercises below is [Google Colab](https://colab.research.google.com).
The linked Colab notebooks handle dependency resolution themselves, but if you'd like to run the exercises elsewhere, simply install the attached `requirements.txt` into any environment:

```shell
git clone https://github.com/gaussalgo/L2L_MLPrague23.git
pip install -r L2L_MLPrague23/requirements.txt
```

## Outline
### 1. Intro to Transformers
[Open in Colab](https://colab.research.google.com/github/gaussalgo/L2L_MLPrague23/blob/main/notebooks/transformers_intro.ipynb)

- Architectures
- Differences from other architectures (the attention layer)
- Tasks (= objectives)
- Pre-training & fine-tuning
- Inputs and outputs
- Single-token prediction
- Generation
- Iterative prediction
- Other generation strategies
- [Hands-on] Constraining generated output (forcing & disabling words; see the sketch after this list)
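To give a flavour of the hands-on part, here is a minimal sketch of constrained generation with the `transformers` `generate()` API. The `google/flan-t5-small` checkpoint and the example phrase are illustrative assumptions, not necessarily what the notebook uses:

```python
# A sketch of forcing & disabling words during generation.
# Assumption: google/flan-t5-small stands in for the workshop's model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

inputs = tokenizer("Translate English to German: How old are you?",
                   return_tensors="pt")

# Word lists must be lists of token ids; forcing words requires beam search.
force_ids = tokenizer(["Sie"], add_special_tokens=False).input_ids
bad_ids = tokenizer(["alt"], add_special_tokens=False).input_ids

outputs = model.generate(
    **inputs,
    num_beams=4,
    force_words_ids=force_ids,   # the output must contain "Sie"
    bad_words_ids=bad_ids,       # the output must not contain "alt"
    max_new_tokens=40,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```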
### 2. In-context Learning and Few-shot Learning with Transformers

[Open in Colab](https://colab.research.google.com/github/gaussalgo/L2L_MLPrague23/blob/main/notebooks/ICL_intro.ipynb)

- Problem definition (usage)
- Contrast with Supervised ML
- Zero-shot vs few-shot
- Examples
- [Hands-on] comparison of zero-shot vs. few-shot performance of a chosen ICL model (see the sketch after this list)
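As a taste of that comparison, here is a minimal sketch contrasting a zero-shot prompt with a few-shot one; the model choice (`google/flan-t5-small`) and the toy sentiment prompts are assumptions for illustration:

```python
# Zero-shot vs. few-shot prompting of the same model.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

# Zero-shot: the task is described, but no examples are given.
zero_shot = "Classify the sentiment as positive or negative: I loved this film!"

# Few-shot: a couple of labelled demonstrations precede the query.
few_shot = (
    "Review: The plot was dull. Sentiment: negative\n"
    "Review: A wonderful, moving story. Sentiment: positive\n"
    "Review: I loved this film! Sentiment:"
)

print(generator(zero_shot)[0]["generated_text"])
print(generator(few_shot)[0]["generated_text"])
```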
### 3. Methods for Improving ICL

#### Inference
- Heterogeneity of demonstrations
- Prompt engineering
- [PromptSource](https://github.com/bigscience-workshop/promptsource), a database of prompts
- [Hands-on] prompt engineering, inspired by the training data (see the sketch after this list)
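For instance, PromptSource exposes crowd-sourced prompt templates per dataset. A minimal sketch, assuming the `promptsource` package is installed (`pip install promptsource`) and using its IMDB templates as an example:

```python
# Browsing and applying an existing prompt template from PromptSource.
from promptsource.templates import DatasetTemplates

imdb_prompts = DatasetTemplates("imdb")
print(imdb_prompts.all_template_names)  # templates available for IMDB

template = imdb_prompts[imdb_prompts.all_template_names[0]]
example = {"text": "I loved this film!", "label": 1}

# apply() fills the template and returns the [input, target] pair.
input_text, target_text = template.apply(example)
print(input_text)
print(target_text)
```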
#### Training Strategies

[Open in Colab](https://colab.research.google.com/github/gaussalgo/L2L_MLPrague23/blob/main/notebooks/existing_ICL_models.ipynb)

- Training strategies & existing models
- Training in an explicit few-shot format (QA)
- Instruction tuning
- Multitask learning
- Chain-of-Thought (see the sketch after this list)
- Pre-training on code
- Fine-tuning with human feedback
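To illustrate the Chain-of-Thought item above, here is a minimal prompting sketch; `google/flan-t5-large` (an instruction-tuned model trained with CoT data) and the arithmetic word problems are assumptions for illustration:

```python
# Chain-of-Thought prompting: the demonstration spells out its reasoning,
# nudging the model to reason step by step on the new question.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-large")

prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls are 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples. It used 20 and bought 6 more. "
    "How many apples are there now?\n"
    "A:"
)
print(generator(prompt, max_new_tokens=80)[0]["generated_text"])
```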
#### The theory behind ICL: why does it exist?
- Data properties fostering ICL
- Experiments
- Explanations of the existing models

### 4. Hands-on: Improving Few-shot ICL
[Open in Colab](https://colab.research.google.com/github/gaussalgo/L2L_MLPrague23/blob/main/notebooks/hands_on_improving_ICL.ipynb)

- [Hands-on] Customizing few-shot ICL to specialized data
- Practical training pipeline
- Overview of the training pipeline
- Adaptor example (a generic sketch follows this list)
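As a rough stand-in for that pipeline, here is a minimal fine-tuning sketch using a plain Hugging Face `Seq2SeqTrainer` rather than the workshop's Adaptor library; the model and the SQuAD subset are illustrative assumptions:

```python
# Fine-tuning a small seq2seq model on QA pairs cast into text-to-text form.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

dataset = load_dataset("squad", split="train[:1000]")

def preprocess(example):
    # Cast each QA pair into an input -> target text pair.
    inputs = tokenizer(
        f"question: {example['question']} context: {example['context']}",
        truncation=True, max_length=512)
    inputs["labels"] = tokenizer(
        example["answers"]["text"][0], truncation=True,
        max_length=32)["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="icl_model",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

-------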
### Model evaluation & competition [Optional]
If you have trained your own few-shot ICL model, it would be a pity not to test it on some unseen reasoning tasks.
See the [competition readme](competition) for how to evaluate your model and, if it beats the baseline, how to spread the word!
-------