{"id":42308003,"url":"https://github.com/nerdai/llms-from-scratch-rs","last_synced_at":"2026-03-03T01:01:28.979Z","repository":{"id":269236425,"uuid":"876565389","full_name":"nerdai/llms-from-scratch-rs","owner":"nerdai","description":"A comprehensive Rust translation of the code from Sebastian Raschka's Build an LLM from Scratch book.","archived":false,"fork":false,"pushed_at":"2026-02-27T03:59:18.000Z","size":1065,"stargazers_count":304,"open_issues_count":11,"forks_count":34,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-02-27T10:40:18.545Z","etag":null,"topics":["candle","gpt","llms","nlp","rust"],"latest_commit_sha":null,"homepage":"https://www.amazon.com/Build-Large-Language-Model-Scratch/dp/1633437167","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nerdai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-10-22T07:27:40.000Z","updated_at":"2026-02-27T03:59:21.000Z","dependencies_parsed_at":"2024-12-22T02:25:09.582Z","dependency_job_id":"94594e01-863f-407f-a0f5-71e42ebba883","html_url":"https://github.com/nerdai/llms-from-scratch-rs","commit_stats":null,"previous_names":["nerdai/llms-from-scratch-rs"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/nerdai/llms-from-scratch-rs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdai%2Fllms-from-scratch-rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/nerdai%2Fllms-from-scratch-rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdai%2Fllms-from-scratch-rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdai%2Fllms-from-scratch-rs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nerdai","download_url":"https://codeload.github.com/nerdai/llms-from-scratch-rs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nerdai%2Fllms-from-scratch-rs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30028228,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T00:31:48.536Z","status":"ssl_error","status_checked_at":"2026-03-03T00:30:56.176Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["candle","gpt","llms","nlp","rust"],"created_at":"2026-01-27T11:12:46.288Z","updated_at":"2026-03-03T01:01:28.887Z","avatar_url":"https://github.com/nerdai.png","language":"Rust","funding_links":[],"categories":["Rust"],"sub_categories":[],"readme":"# LLMs from scratch - Rust\n\n\u003cp align=\"center\"\u003e\n  \u003cimg height=\"400\" src=\"https://d3ddy8balm3goa.cloudfront.net/llms-from-scratch-rs/main-image.svg\" alt=\"cover\"\u003e\n\u003c/p\u003e\n\nThis project aims to provide Rust code that follows the incredible text,\nBuild 
An LLM From Scratch by Sebastian Raschka. The book provides arguably\nthe clearest step-by-step walkthrough for building a GPT-style LLM. Listed\nbelow are the titles for each of the 7 chapters of the book.\n\n1. Understanding large language models\n2. Working with text data\n3. Coding attention mechanisms\n4. Implementing a GPT model from scratch to generate text\n5. Pretraining on unlabeled data\n6. Fine-tuning for classification\n7. Fine-tuning to follow instructions\n\nThe code (see associated [github repo](https://github.com/rasbt/LLMs-from-scratch))\nprovided in the book is all written in PyTorch (understandably so). In this\nproject, we translate all of the PyTorch code into Rust code by using the\n[Candle](https://github.com/huggingface/candle) crate, which is a minimalist ML\nframework.\n\n## Usage\n\nThe recommended way of using this project is by cloning this repo and using\nCargo to run the examples and exercises.\n\n```sh\n# SSH\ngit clone git@github.com:nerdai/llms-from-scratch-rs.git\n\n# HTTPS\ngit clone https://github.com/nerdai/llms-from-scratch-rs.git\n```\n\nIt is important to note that we use the same datasets that are used by Sebastian\nin his book. Use the command below to download the data into a subfolder called\n`data/`, which will eventually be used by the examples and exercises of the book.\n\n```sh\nmkdir -p 'data/'\nwget 'https://raw.githubusercontent.com/rasbt/LLMs-from-scratch/main/ch02/01_main-chapter-code/the-verdict.txt' -O 'data/the-verdict.txt'\n```\n\n### Navigating the code\n\nUsers have the option of reading the code via their chosen IDE and the cloned\nrepo, or by using the project's [docs](https://docs.rs/llms-from-scratch-rs/latest/llms_from_scratch_rs/).\n\nNOTE: The import style used in all of the `examples` and `exercises` modules is\nnot conventional. Specifically, relevant imports are made under the `main()` method\nof every `Example` and `Exercise` implementation. 
This is done for educational\npurposes to assist the reader of the book in knowing precisely what imports are\nneeded for the example/exercise at hand.\n\n### Running `Examples` and `Exercises`\n\nAfter cloning the repo, you can cd to the project's root directory and execute\nthe `main` binary.\n\n```sh\n# Run code for Example 05.07\ncargo run example 05.07\n\n# Run code for Exercise 5.5\ncargo run exercise 5.5\n```\n\nIf using a cuda-enabled device, you turn on the cuda feature via the `--features cuda`\nflag:\n\n```sh\n# Run code for Example 05.07\ncargo run --features cuda example 05.07\n\n# Run code for Exercise 5.5\ncargo run --features cuda exercise 5.5\n```\n\n### Listing `Examples`\n\nTo list the `Examples`, use the following command:\n\n```sh\ncargo run list --examples\n```\n\nA snippet of the output is pasted below.\n\n```sh\nEXAMPLES:\n+-------+----------------------------------------------------------------------+\n| Id    | Description                                                          |\n+==============================================================================+\n| 02.01 | Example usage of `listings::ch02::sample_read_text`                  |\n|-------+----------------------------------------------------------------------|\n| 02.02 | Use candle to generate an Embedding Layer.                           |\n|-------+----------------------------------------------------------------------|\n| 02.03 | Create absolute postiional embeddings.                               |\n|-------+----------------------------------------------------------------------|\n| 03.01 | Computing attention scores as a dot product.                         |\n...\n|-------+----------------------------------------------------------------------|\n| 06.13 | Example usage of `train_classifier_simple` and `plot_values`         |\n|       | function.                                                            
|\n|-------+----------------------------------------------------------------------|\n| 06.14 | Loading fine-tuned model and calculate performance on whole train,   |\n|       | val and test sets.                                                   |\n|-------+----------------------------------------------------------------------|\n| 06.15 | Example usage of `classify_review`.                                  |\n+-------+----------------------------------------------------------------------+\n```\n\n### Listing `Exercises`\n\nOne can similarly list the `Exercises` using:\n\n```sh\ncargo run list --exercises\n```\n\n```sh\n# first few lines of output\nEXERCISES:\n+-----+------------------------------------------------------------------------+\n| Id  | Statement                                                              |\n+==============================================================================+\n| 2.1 | Byte pair encoding of unknown words                                    |\n|     |                                                                        |\n|     | Try the BPE tokenizer from the tiktoken library on the unknown words   |\n|     | 'Akwirw ier' and print the individual token IDs. Then, call the decode |\n|     | function on each of the resulting integers in this list to reproduce   |\n|     | the mapping shown in figure 2.11. Lastly, call the decode method on    |\n|     | the token IDs to check whether it can reconstruct the original input,  |\n|     | 'Akwirw ier.'                                                          
|\n|-----+------------------------------------------------------------------------|\n| 2.2 | Data loaders with different strides and context sizes                  |\n|     |                                                                        |\n|     | To develop more intuition for how the data loader works, try to run it |\n|     | with different settings such as `max_length=2` and `stride=2`, and     |\n|     | `max_length=8` and `stride=2`.                                         |\n|-----+------------------------------------------------------------------------|\n...\n|-----+------------------------------------------------------------------------|\n| 6.2 | Fine-tuning the whole model                                            |\n|     |                                                                        |\n|     | Instead of fine-tuning just the final transformer block, fine-tune the |\n|     | entire model and assess the effect on predictive performance.          |\n|-----+------------------------------------------------------------------------|\n| 6.3 | Fine-tuning the first vs. last token                                   |\n|     |                                                                        |\n|     | Try fine-tuning the first output token. Notice the changes in          |\n|     | predictive performance compared to fine-tuning the last output token.  |\n+-----+------------------------------------------------------------------------+\n```\n\n## [Alternative Usage] Installing from `crates.io`\n\nAlternatively, users have the option of installing this crate directly via\n`cargo install` (_Be sure to have Rust and Cargo installed first. 
See\n[here](https://doc.rust-lang.org/cargo/getting-started/installation.html) for\ninstallation instructions._):\n\n```sh\ncargo install llms-from-scratch-rs\n```\n\nOnce installed, users can run the main binary in order to run the various\nExercises and Examples.\n\n```sh\n# Run code for Example 05.07\ncargo run example 05.07\n\n# Run code for Exercise 5.5\ncargo run exercise 5.5\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnerdai%2Fllms-from-scratch-rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnerdai%2Fllms-from-scratch-rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnerdai%2Fllms-from-scratch-rs/lists"}