Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/agasheaditya/handson-transformers
End-to-end implementation of Transformers using PyTorch from scratch
deep-learning nlp python3 pytorch streamlit transformers
- Host: GitHub
- URL: https://github.com/agasheaditya/handson-transformers
- Owner: agasheaditya
- License: MIT
- Created: 2024-08-26T20:35:04.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-09-04T17:28:27.000Z (5 months ago)
- Last Synced: 2024-09-06T00:39:34.139Z (5 months ago)
- Topics: deep-learning, nlp, python3, pytorch, streamlit, transformers
- Language: Jupyter Notebook
- Homepage: https://handson-transformers-production.up.railway.app/
- Size: 2.37 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Hands-on Transformers
**End-to-end implementation of Transformers using PyTorch from scratch**

References:
- Transformers Blog: https://pastoral-cloudberry-567.notion.site/Transformers-fdc33a784ae64e138bd6bf1e19f2bbdf
- Attention is all you need: https://arxiv.org/pdf/1706.03762
- HuggingFace Course: https://huggingface.co/learn/nlp-course/en/chapter1/3?fw=pt
---

Implementing an end-to-end Transformer model in PyTorch from scratch, and training it to generate paragraphs given a keyword or phrase as input.

### Files and usage:
### Files and usage:
- **TransformerModel.py** --> Model class containing the full architecture and logic of the Transformer model
- **train_beta.ipynb** --> Jupyter Notebook to train the model and run sample inference on the trained model
- **trained-transformer_model.pth** --> Trained model checkpoint _(saved state dict)_
- **Articles.xlsx** --> Dataset used to train the model (https://www.kaggle.com/datasets/asad1m9a9h6mood/news-articles)
- **requirements.txt** --> pip freeze of dependencies
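The `.pth` checkpoint above is a saved state dict. A minimal round-trip sketch with a toy module standing in for the class in `TransformerModel.py` (whose actual constructor arguments are not shown here):

```python
import torch
import torch.nn as nn

# Toy module standing in for the Transformer class in TransformerModel.py.
model = nn.Linear(4, 2)

# Saving: a .pth checkpoint like trained-transformer_model.pth is just a
# dict of parameter tensors produced by state_dict().
torch.save(model.state_dict(), "checkpoint.pth")

# Loading: build the same architecture, then restore the weights.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load("checkpoint.pth", map_location="cpu"))
```

`map_location="cpu"` lets a checkpoint trained on the GPU load on a CPU-only machine.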
---

### Working:
_The model takes a keyword or phrase, tokenizes it, and then iteratively generates text by predicting the next token in the sequence.
The model uses embedding, positional encoding, and an encoder-decoder architecture to generate coherent text.
Sampling strategies like temperature scaling and top-k sampling help to produce varied and natural outputs._

### Setup and Usage:
* Hardware used:
- CPU: Intel i7-10750H (2.60 GHz)
- RAM: 16 GB
- GPU: NVIDIA GeForce RTX 2060 (6 GB)
* Create virtual environment
```shell
virtualenv env
```

* Activate virtual environment
```shell
./env/Scripts/activate
```

_(On Windows; on Linux/macOS use `source env/bin/activate`.)_

* Installing dependencies
```shell
pip install -r requirements.txt
```
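The temperature scaling and top-k sampling mentioned in the Working section can be sketched in plain Python. This is a simplified illustration, not the repo's actual sampling code in `TransformerModel.py`:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=5):
    # Temperature scaling: dividing logits before softmax sharpens the
    # distribution when temperature < 1 and flattens it when > 1.
    scaled = [l / temperature for l in logits]
    # Top-k: keep only the k highest-scoring token ids.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the surviving logits (max-subtracted for stability).
    m = max(scaled[i] for i in top)
    exps = [math.exp(scaled[i] - m) for i in top]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token id from the renormalized distribution.
    return random.choices(top, weights=probs, k=1)[0]
```

With `top_k=1` this reduces to greedy decoding; a larger `top_k` with temperature near 1 gives more varied output.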
---

### Dashboard:
The dashboard can generate a paragraph with the trained model when given a keyword or phrase as input.
* Running a Streamlit app
```shell
streamlit run app.py
```

![Streamlit App](https://github.com/user-attachments/assets/7d373d93-5bdb-4f27-8686-55547e30801f)