{"id":25947478,"url":"https://github.com/warner-benjamin/transformer-from-scratch","last_synced_at":"2025-03-04T10:19:37.808Z","repository":{"id":279884828,"uuid":"940320889","full_name":"warner-benjamin/transformer-from-scratch","owner":"warner-benjamin","description":"Test your understanding of the Transformer by writing one from scratch","archived":false,"fork":false,"pushed_at":"2025-02-28T04:39:26.000Z","size":35,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-28T09:34:14.336Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/warner-benjamin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-28T01:18:25.000Z","updated_at":"2025-02-28T07:14:17.000Z","dependencies_parsed_at":"2025-02-28T09:34:23.609Z","dependency_job_id":"fb688cb2-9455-459c-8a69-22b35f1a5bb8","html_url":"https://github.com/warner-benjamin/transformer-from-scratch","commit_stats":null,"previous_names":["warner-benjamin/transformer-from-scratch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/warner-benjamin%2Ftransformer-from-scratch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/warner-benjamin%2Ftransformer-from-scratch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/warner-benjamin%2Ftransformer-from-scratch/releases","manifests_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/warner-benjamin%2Ftransformer-from-scratch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/warner-benjamin","download_url":"https://codeload.github.com/warner-benjamin/transformer-from-scratch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241827092,"owners_count":20026608,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-04T10:19:37.047Z","updated_at":"2025-03-04T10:19:37.785Z","avatar_url":"https://github.com/warner-benjamin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Create a Transformer From Scratch\n\nTest your understanding of the Transformer by coding one from scratch.\n\nThis repository is a companion to my *Create a Transformer From Scratch* series, [Part One: The Attention Mechanism](https://benjaminwarner.dev/2023/07/01/attention-mechanism.html) and [Part Two: The Rest of the Transformer](https://benjaminwarner.dev/2023/07/28/rest-of-the-transformer.html), or can be used with books like [Build a Large Language Model (From Scratch)](https://sebastianraschka.com/books) by Sebastian Raschka. It will provide multiple exercises to guide you through writing a Transformer from scratch.\n\n## Current exercises:\n\n- [Attention Mechanism](exercises/attention_mechanism/README.md)\n\n\u003e **Note:** The reference implementations are generally available in `solution` folders, but you are strongly encouraged to implement the solutions yourself before looking at them. 
The learning happens in the struggle!\n\n## How to Use this Repository\n\nAfter installing (see the [Getting Started](#getting-started) section below), work your way through the exercises in order. Each exercise has its own README with instructions.\n\nMake sure to turn off any code completion tools (e.g. Copilot or Cursor autocomplete) as they will likely be able to solve the exercises for you.\n\nIf you get stuck, each exercise has a Socratic prompt to paste into [ChatGPT](https://chatgpt.com) or [Claude](https://claude.ai). These prompts should instruct ChatGPT or Claude to guide you through the problem, rather than give you the solution outright.\n\n## Getting Started\n\nThis project uses [uv](https://docs.astral.sh/uv/) to manage dependencies (uv is compatible with Conda environments; see the [GPU](#gpu) section for an example of how to integrate the two). First, clone the repository.\n\n```bash\ngit clone https://github.com/warner-benjamin/transformer-from-scratch.git\ncd transformer-from-scratch\n```\n\nThen, depending on your system, run one of the following commands to install the dependencies.\n\n### CPU\n\n```bash\nuv sync --extra cpu\n```\n\n### GPU\n\nFor Flash Attention support you'll need to install Cuda/Cuda Toolkit 12.4. 
The simplest way is to use [Miniconda](https://docs.anaconda.com/miniconda/install) to install it.\n\n```bash\n# swap cuda-toolkit for cuda if you want to compile cuda packages\nconda create -n fromscratch python=3.12 uv cuda-toolkit -c nvidia/label/cuda-12.4.1 -c conda-forge\nconda activate fromscratch\n# This sets uv to use the active Conda environment whether using uv or uv pip commands.\nexport UV_PROJECT_ENVIRONMENT=\"$CONDA_PREFIX\"\n```\n\nOr install [system Cuda](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/) (not recommended).\n\nWith Cuda/Cuda Toolkit installed, use uv to install the library:\n\n```bash\nuv sync --extra gpu\n\n# Install flash attention if you have an Ampere (RTX 30xx series) or newer GPU\nuv sync --extra gpu --extra flash\n```\n\n### Apple Silicon (macOS)\n\n```bash\nuv sync\n```\n\n### Tests\n\nAfter installing, you can run the tests to make sure everything is working.\n\n```bash\npytest\n```\n\nThis should report multiple skipped tests, since no solutions are implemented yet.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwarner-benjamin%2Ftransformer-from-scratch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwarner-benjamin%2Ftransformer-from-scratch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwarner-benjamin%2Ftransformer-from-scratch/lists"}