https://github.com/mrdbourke/learn-transformers
Work in progress. Simple repository to learn Transformers (and transformers).
- Host: GitHub
- URL: https://github.com/mrdbourke/learn-transformers
- Owner: mrdbourke
- License: MIT
- Created: 2023-06-09T03:27:26.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-27T03:49:47.000Z (over 1 year ago)
- Last Synced: 2025-03-05T18:58:56.129Z (about 2 months ago)
- Language: Jupyter Notebook
- Size: 688 KB
- Stars: 41
- Watchers: 3
- Forks: 11
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Learn Transformers (work-in-progress)
When I was growing up, Transformers were cars that turned into robots.
Now they're the backbone of many modern machine learning and AI apps.
The goal of this repo will be to learn (for myself) and provide simple resources for others on*:
1. The attention mechanism and the original Transformer architecture.
2. Various Transformer-based models (e.g. GPT).
3. The [`transformers`](https://huggingface.co/docs/transformers/index) library by Hugging Face (many different types of models here, but why not?).

1 & 2 will be more research-focused, whereas 3 will be very practically applicable (minimal code sketches of each side are included below).
\*Outline subject to change.
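To make goals 1 and 2 concrete, here's a minimal sketch of scaled dot-product attention, the core operation of the original Transformer. This is plain PyTorch written for illustration (the function name and tensor shapes are my own choices, not code from this repo):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: tensors of shape (batch, seq_len, d_k)."""
    d_k = q.size(-1)
    # Query-key similarity scores, scaled by sqrt(d_k) to keep softmax inputs tame.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -inf so their softmax weight becomes ~0.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention weights over the keys
    return weights @ v                   # weighted sum of the values

# Toy example: batch of 2 sequences, 5 tokens each, 8-dimensional embeddings.
q = k = v = torch.randn(2, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 5, 8])
```

This is the Attention(Q, K, V) = softmax(QKᵀ/√d_k)V equation from the Transformer paper; multi-head attention runs several of these in parallel on linearly projected inputs.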
## Prerequisites
Assumes basic knowledge of PyTorch (or any other ML framework) and deep learning in general.
See [learnpytorch.io](https://learnpytorch.io) for a beginner-friendly intro.
Or my [Learn PyTorch in a day video on YouTube](https://youtu.be/Z_ikDlimN6A) to get up to speed and then come back here.
## Resources
Some of the resources I've found useful (this will grow over time).
* Transformer paper: https://arxiv.org/abs/1706.03762
* The Annotated Transformer: http://nlp.seas.harvard.edu/2018/04/01/attention.html
* Transformers from scratch: https://peterbloem.nl/blog/transformers
* Attention functions in PyTorch code: https://github.com/sooftware/attentions/blob/master/attentions.py
* xFormers by Facebook Research, an in-depth implementation of many Transformer Architecture components: https://github.com/facebookresearch/xformers
* Lilian Weng's overview of attention mechanisms (including self-attention): https://lilianweng.github.io/posts/2018-06-24-attention/#self-attention
* Jay Mody on building intuition for attention: https://jaykmody.com/blog/attention-intuition/
## Extras
* Modifications to the original Transformer architecture (warning: there are lots): https://arxiv.org/abs/2102.11972
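Finally, to complement goal 3, a minimal sketch of the Hugging Face `transformers` library via its `pipeline` API (the checkpoint below is just one common sentiment model, not a requirement; any suitable checkpoint works):

```python
from transformers import pipeline

# Build a ready-to-use sentiment classifier from a pretrained checkpoint.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("Transformers used to be cars that turned into robots."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```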