{"id":23902532,"url":"https://github.com/adrianocleao/transformers-from-scratch","last_synced_at":"2025-09-20T03:41:32.359Z","repository":{"id":251445335,"uuid":"837441690","full_name":"AdrianoCLeao/transformers-from-scratch","owner":"AdrianoCLeao","description":"This repository is dedicated to reconstructing the Transformers architecture from the ground up using PyTorch. Based on the model presented in the \"Attention is All You Need\" paper, this project aims to better understand the architecture of one of the most important advancements in NLP programming.","archived":false,"fork":false,"pushed_at":"2024-08-05T05:58:10.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-23T10:44:04.044Z","etag":null,"topics":["artificial-intelligence","attention-is-all-you-need","machine-learning","neural-network","nlp","pythorch","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AdrianoCLeao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-03T02:14:49.000Z","updated_at":"2024-12-15T23:17:03.000Z","dependencies_parsed_at":"2025-01-04T22:48:21.270Z","dependency_job_id":"60640429-2786-4500-8b1c-321ccba25b8d","html_url":"https://github.com/AdrianoCLeao/transformers-from-scratch","commit_stats":null,"previous_names":["adrianocleao/transformersfromscratch","adrianocleao/transformers-from-scratch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AdrianoCLea
o/transformers-from-scratch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianoCLeao%2Ftransformers-from-scratch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianoCLeao%2Ftransformers-from-scratch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianoCLeao%2Ftransformers-from-scratch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianoCLeao%2Ftransformers-from-scratch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AdrianoCLeao","download_url":"https://codeload.github.com/AdrianoCLeao/transformers-from-scratch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianoCLeao%2Ftransformers-from-scratch/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263421503,"owners_count":23464013,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","attention-is-all-you-need","machine-learning","neural-network","nlp","pythorch","transformers"],"created_at":"2025-01-04T22:48:20.811Z","updated_at":"2025-09-20T03:41:27.321Z","avatar_url":"https://github.com/AdrianoCLeao.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Transformer from Scratch with PyTorch\n\u003cimg src=\"https://github.com/user-attachments/assets/f5e88463-9f28-44b9-8dbc-240da904e859\" alt=\"transformer\" width=\"450\"\u003e\n\n## Overview\nThis project implements an English-to-Portuguese translation system 
using a Transformer model built from scratch. The model follows the paper \"Attention Is All You Need\" (Vaswani et al., 2017), which introduced the Transformer architecture for translation and other natural language processing tasks.\n\n## Project Description\n\nThe goal of this project is to build a machine translation model that translates English text into Portuguese. The model is implemented from scratch, without pre-built Transformer libraries, and is trained on the **Helsinki-NLP/opus_books** dataset available on Hugging Face Datasets.\n\n## Dataset\n\nThe dataset used is [Helsinki-NLP/opus_books](https://huggingface.co/datasets/Helsinki-NLP/opus_books/viewer/en-pt), a collection of books available in both English and Portuguese. It is well suited to training translation models because it provides sentence pairs in both languages.\n\n## Model Architecture\nThe Transformer model consists of:\n\n- Encoder: Encodes the input sequence into an internal representation.\n- Decoder: Decodes the internal representation to generate the output sequence.\n- Attention Layers: Capture relationships between different parts of the input and output.\n\nTraining minimizes cross-entropy loss, and the model is optimized with the Adam optimizer.\n\n## Usage\n\n1. Train the Model\nRun the `train.py` script to train the model:\n\n```bash\npython train/train.py\n```\nThe script loads the dataset, trains the Transformer model, and saves the model weights after each epoch.\n\n2. 
Translate Text\nAfter training, you can use the translate.py script to translate English text into Portuguese:\n\n```bash\npython translate.py \"Your English text here\"\n```\nIf you do not provide text, the script will use a default example.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadrianocleao%2Ftransformers-from-scratch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadrianocleao%2Ftransformers-from-scratch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadrianocleao%2Ftransformers-from-scratch/lists"}