https://github.com/prateeknigam9/transformers-from-scratch
This repository showcases an encoder-only transformer model built from scratch, focusing on intent classification.
- Host: GitHub
- URL: https://github.com/prateeknigam9/transformers-from-scratch
- Owner: prateeknigam9
- Created: 2024-09-30T16:10:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-24T04:08:02.000Z (over 1 year ago)
- Last Synced: 2024-10-24T18:05:29.115Z (over 1 year ago)
- Topics: attention-mechanism, encoder-decoder-model, intent-classification, llm, pytorch, transformers
- Language: Jupyter Notebook
- Homepage:
- Size: 71 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# Transformers from Scratch
Hi!
This repository contains an implementation of transformer models built from scratch, divided into three main components (a minimal self-attention sketch follows the list):
1. **Encoder-Only Model**
2. **Decoder-Only Model**
3. **Combined Encoder-Decoder Model**
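All three variants are built around the same core operation, scaled dot-product self-attention; an encoder attends bidirectionally, while a decoder adds a causal mask so each position sees only earlier tokens. The sketch below is a minimal PyTorch illustration of that shared primitive, not code taken from this repository:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # hide masked positions
    weights = F.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v                   # weighted sum of value vectors

q = k = v = torch.randn(2, 5, 64)
causal = torch.tril(torch.ones(5, 5))   # decoder-style mask: attend to the past only
out = scaled_dot_product_attention(q, k, v, mask=causal)
print(out.shape)  # torch.Size([2, 5, 64])
```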
Transformers have revolutionized natural language processing and are foundational to many state-of-the-art models. This project aims to provide a clear and modular implementation of transformers, inspired by the work of [Sebastian Raschka](https://sebastianraschka.com/).
## Motivation
I started this project to better understand how transformers work. I wanted to learn the details of each step involved in building a transformer model. By coding everything from scratch, I could see how the different parts fit together and how they function.
My goal was to become more skilled at working with transformers so that I could apply this knowledge to real-world tasks and research. By understanding the inner workings of transformers, I can also explore ways to improve or customize them for specific needs. Overall, this project helped me gain a deeper appreciation for this important technology in natural language processing.
## Features
- **Modular Code**: Each component of the transformer is implemented in a modular format for easy understanding and extensibility (see the sketch after this list).
- **Customizable Architecture**: The models can be easily modified to suit specific tasks or research interests.
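As a rough picture of what such a modular, tunable building block can look like (this is my own sketch; the class name, defaults, and layer choices are assumptions, not the repository's code):

```python
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One transformer encoder layer; hyperparameters are plain constructor
    arguments, so the architecture can be resized per task (illustrative)."""
    def __init__(self, d_model=128, n_heads=4, d_ff=512, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)           # bidirectional self-attention
        x = self.norm1(x + self.drop(attn_out))    # residual + layer norm
        x = self.norm2(x + self.drop(self.ff(x)))  # position-wise feed-forward
        return x
```

Stacking several such blocks, or changing the head count and feed-forward width, is then a one-line change.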
## Directory Structure
```
/transformers-from-scratch
│
├── encoder_only/
│   ├── model.py
│   ├── train.py
│   └── evaluate.py
│
├── decoder_only/
│   ├── model.py
│   ├── train.py
│   └── evaluate.py
│
└── combined_model/
    ├── model.py
    ├── train.py
    └── evaluate.py
```
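Given this layout, each variant pairs a model definition with its own training and evaluation scripts. A hypothetical end-to-end use might look like the following (every imported name here is a guess from the file layout; the README does not document the actual APIs):

```python
# Hypothetical workflow inferred from the directory tree; the real names
# exported by model.py, train.py, and evaluate.py may differ.
from encoder_only.model import EncoderClassifier  # assumed model class
from encoder_only.train import train              # assumed training entry point
from encoder_only.evaluate import evaluate        # assumed evaluation helper

model = EncoderClassifier()   # architecture from model.py
train(model)                  # fit on the intent-classification data
print(evaluate(model))        # held-out performance
```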
Thank you!
[Prateek Nigam](https://prateeknigam9.github.io/PrateekNigam)