https://github.com/prateeknigam9/transformers-from-scratch
This repository showcases an encoder-only transformer model built from scratch, focusing on intent classification.
- Host: GitHub
- URL: https://github.com/prateeknigam9/transformers-from-scratch
- Owner: prateeknigam9
- Created: 2024-09-30T16:10:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-24T04:08:02.000Z (over 1 year ago)
- Last Synced: 2024-10-24T18:05:29.115Z (over 1 year ago)
- Topics: attention-mechanism, encoder-decoder-model, intent-classification, llm, pytorch, transformers
- Language: Jupyter Notebook
- Homepage:
- Size: 71 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# Transformers from Scratch
Hi!
This repository contains an implementation of transformer models built from scratch, divided into three main components (a minimal self-attention sketch follows the list):
1. **Encoder-Only Model**
2. **Decoder-Only Model**
3. **Combined Encoder-Decoder Model**
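All three variants are built around the same core operation, scaled dot-product self-attention; an encoder attends bidirectionally, while a decoder adds a causal mask so each position sees only earlier tokens. The sketch below is a minimal PyTorch illustration of that shared primitive, not code taken from this repository:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # hide masked positions
    weights = F.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v                   # weighted sum of value vectors

q = k = v = torch.randn(2, 5, 64)
causal = torch.tril(torch.ones(5, 5))   # decoder-style mask: attend to the past only
out = scaled_dot_product_attention(q, k, v, mask=causal)
print(out.shape)  # torch.Size([2, 5, 64])
```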
Transformers have revolutionized natural language processing and are foundational to many state-of-the-art models. This project aims to provide a clear and modular implementation of transformers, inspired by the work of [Sebastian Raschka](https://sebastianraschka.com/).
## Motivation
I started this project to better understand how transformers work. I wanted to learn the details of each step involved in building a transformer model. By coding everything from scratch, I could see how the different parts fit together and how they function.
My goal was to become more skilled at working with transformers so that I could apply this knowledge to real-world tasks and research. By understanding the inner workings of transformers, I can also explore ways to improve or customize them for specific needs. Overall, this project helped me gain a deeper appreciation for this important technology in natural language processing.
## Features
- **Modular Code**: Each component of the transformer is implemented in a modular format for easy understanding and extensibility (see the sketch after this list).
- **Customizable Architecture**: The models can be easily modified to suit specific tasks or research interests.
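As a rough picture of what such a modular, tunable building block can look like (this is my own sketch; the class name, defaults, and layer choices are assumptions, not the repository's code):

```python
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One transformer encoder layer; hyperparameters are plain constructor
    arguments, so the architecture can be resized per task (illustrative)."""
    def __init__(self, d_model=128, n_heads=4, d_ff=512, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)           # bidirectional self-attention
        x = self.norm1(x + self.drop(attn_out))    # residual + layer norm
        x = self.norm2(x + self.drop(self.ff(x)))  # position-wise feed-forward
        return x
```

Stacking several such blocks, or changing the head count and feed-forward width, is then a one-line change.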
## Directory Structure
```
/transformers-from-scratch
│
├── encoder_only/
│   ├── model.py
│   ├── train.py
│   └── evaluate.py
│
├── decoder_only/
│   ├── model.py
│   ├── train.py
│   └── evaluate.py
│
└── combined_model/
    ├── model.py
    ├── train.py
    └── evaluate.py
```
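Given this layout, each variant pairs a model definition with its own training and evaluation scripts. A hypothetical end-to-end use might look like the following (every imported name here is a guess from the file layout; the README does not document the actual APIs):

```python
# Hypothetical workflow inferred from the directory tree; the real names
# exported by model.py, train.py, and evaluate.py may differ.
from encoder_only.model import EncoderClassifier  # assumed model class
from encoder_only.train import train              # assumed training entry point
from encoder_only.evaluate import evaluate        # assumed evaluation helper

model = EncoderClassifier()   # architecture from model.py
train(model)                  # fit on the intent-classification data
print(evaluate(model))        # held-out performance
```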
Thank you!
[Prateek Nigam](https://prateeknigam9.github.io/PrateekNigam)