{"id":13569241,"url":"https://github.com/keyvank/femtoGPT","last_synced_at":"2025-04-04T05:31:45.951Z","repository":{"id":170393464,"uuid":"646532460","full_name":"keyvank/femtoGPT","owner":"keyvank","description":"Pure Rust implementation of a minimal Generative Pretrained Transformer","archived":false,"fork":false,"pushed_at":"2024-09-10T20:37:30.000Z","size":686,"stargazers_count":832,"open_issues_count":8,"forks_count":51,"subscribers_count":15,"default_branch":"main","last_synced_at":"2024-10-29T19:29:45.098Z","etag":null,"topics":["from-scratch","gpt","gpu","llm","machine-learning","neural-network","opencl","rust"],"latest_commit_sha":null,"homepage":"https://discord.gg/wTJFaDVn45","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/keyvank.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-28T17:41:48.000Z","updated_at":"2024-10-28T11:00:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"3f6102b2-27c2-43b9-99ca-865c997c4508","html_url":"https://github.com/keyvank/femtoGPT","commit_stats":null,"previous_names":["keyvank/femtogpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keyvank%2FfemtoGPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keyvank%2FfemtoGPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keyvank%2FfemtoGPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keyvank%2FfemtoGPT/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/keyvank","download_url":"https://codeload.github.com/keyvank/femtoGPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247128702,"owners_count":20888232,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["from-scratch","gpt","gpu","llm","machine-learning","neural-network","opencl","rust"],"created_at":"2024-08-01T14:00:37.506Z","updated_at":"2025-04-04T05:31:45.261Z","avatar_url":"https://github.com/keyvank.png","language":"Rust","funding_links":[],"categories":["Tutorials about LLM","LLM教程与课程","Rust","LLM Tutorials and Courses","Summary","Frameworks","Machine Learning","rust"],"sub_categories":["LLM 评估工具"],"readme":"# :robot: femtoGPT\n\n![crates.io](https://img.shields.io/crates/v/femto-gpt.svg)\n![GitHub top language](https://img.shields.io/github/languages/top/keyvank/femtoGPT)\n![GitHub](https://img.shields.io/github/license/keyvank/femtoGPT)\n\nfemtoGPT is a pure Rust implementation of a minimal Generative Pretrained Transformer.\n\nIt can be used for both *inference* and *training* of GPT-style language-models\nusing **CPUs** and **GPUs**!\n\n(***HEY!*** I'm also writing a book, which will soon discuss the implementation of a LLM in detail! 
Check it out here: [The Super Programmer](https://github.com/keyvank/tsp))\n\n## Usage\n\nTraining:\n\n`cargo run --release -- train`\n\nInference:\n\n`cargo run --release -- infer`\n\n(Note: Add `--features gpu` in order to leverage GPU speedups!)\n\n## Intro\n\nEverything is implemented from scratch, including the tensor processing logic\nalong with training/inference code of a minimal GPT architecture.\n\nThe architecture is almost identical to the one in Andrej Karpathy's\n[nanoGPT video lecture](https://github.com/karpathy/ng-video-lecture).\n\nfemtoGPT is a great start for those who are fascinated by LLMs and would like to\nunderstand how these models work at a very deep level.\n\nfemtoGPT uses nothing but random generation libraries (`rand`/`rand-distr`), data-serialization\nlibraries (`serde`/`bincode` for saving/loading already trained models) and a\nparallel computing library (`rayon`).\n\nfemtoGPT is ~~EXTREMELY SLOW~~ ***relatively fast on CPU 😉***, and most of the\nprimitive operations (e.g. matrix multiplication) are implemented in the simplest way possible.\n\nCorrectness of gradients is checked using the gradient-check method, though it is still quite\npossible that some layers are implemented incorrectly.\n\n([Discord server](https://discord.gg/wTJFaDVn45) for discussions around the project!)\n\n## Usage\n\nMake sure you have the Rust toolchain on your system, in order to compile and run\nthe project:\n\n`curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`\n\nIf you want to train using a GPU, you will first need to make sure your GPU drivers\nare correctly installed on your system, and their OpenCL runtimes are available.\n\nOn Debian systems, you can set up OpenCL runtimes by installing the package `ocl-icd-opencl-dev`:\n\n`sudo apt install ocl-icd-opencl-dev`\n\n***GOOD NEWS!*** *Since femtoGPT's GPU implementation is based on OpenCL, it can\nrun on both NVIDIA and AMD cards, and you won't need to install heavyweight\nCUDA toolkits on your 
system. OpenCL runtimes would suffice!*\n\nNow you'll just need to put the text you want to train your GPT model on, inside\n`dataset.txt`. Make sure it has a small number of unique characters! (E.g. the\ncurrent dataset uses only 65 unique characters!)\n\nThen you'll need to run:\n\n```\ncargo run --release -- train\n```\n\nIt will start training the model and will put the training data in the `train_data`\ndirectory. You can stop the training and continue later!\n\n## Output samples\n\nAfter hours of training a 300k-parameter model on the Shakespeare dataset,\nthis was the output:\n\n```\nLIS:\nTore hend shater sorerds tougeng an herdofed seng he borind,\nOund ourere sthe, a sou so tousthe ashtherd, m se a man stousshan here hat mend serthe fo witownderstesther s ars at atheno sel theas,\nthisth t are sorind bour win soutinds mater horengher\n```\n\nThis is embarrassingly bad, but looking on the bright side, it seems like it has\nbeen able to generate words that are easy to pronounce.\n\nI'm currently training a 10M-parameter model to further examine the correctness\nof my implementation.\n\n**UPDATE 5th June 2023:**\n\nThis is a new output, after more hours of training a model of similar scale:\n\n```\nWhat like but wore pad wo me che nogns yous dares,\nAs supt it nind bupart 'the reed:\nAnd hils not es\n```\n\nObviously the model has started to learn some words and punctuation rules!\n\n**UPDATE 9th June 2023:**\n\nThe model was able to reach a loss value of ~1.4.\n\nHere is an example output:\n\n```\nAdistition gone; true; schistoes for mine souls!\nBefore your home, bariechts should be\nCarlam on that's a worf quirer of him so.\nWhat look'd lack away more\nTo him foot; one hour fortious of saves:\nSon;\n'Tis all Earl mmistling me.\n\nHARSARTIO:\nWhy, idless of my mocks fan that were percious.\nHaving I will thou should and the muour ne'er shor\nTo purple, when duke thy are out him.\nBut will bid you doth remember nature.\nEven OF hencomey, 
carniffeit, I joy\nWarming; my deed, but it on of mortard,\n```\n\n**UPDATE 10th June 2023, sparks of AGI? LOL**\n\nHaving trained a 5-layer model on a dataset gathered from Reddit, tokenized with sentencepiece tokenizer with a vocab-size of 500, for about ~10 hours on a 32-core CPU:\n\nPrompt **\"I think\"**:\n\n```\nI think it s not a scientific method of atheism\nI think it s not a fairly simple possible to have a\nI think that s a possible point, but I m not sure if you re not\nI think that s a great idea, but I m not sure how these are you talking\nI think it s a simple thing to do, but I m not sure how thes\nI think that s a good thing, but I don t think that s what I mean, I think I\nI think that s the only way to be able to use it to be in the same way, but\nI think that the police are relatively relatively relative\nI think that s a simple moron is a simple character\nI think that s the only way to go to the second side of the pol\n```\n\nPrompt **\"Life is\"**:\n\n```\nLife is not a good idea, but it s not a perfection of the opp\nLife is a good thing to say that I don t know what you re talking about, but I don\nLife is not the same as a religion, but I m not sure if you re a\nLife is a perfectly good job of arguing that you are alm\nLife is a perfectly good job of the opposite of the f\nLife is a fundamentalist, and then I m not sure how the h\nLife is not a good idea, and it s not a perfectly good job, but I\nLife is not the same as atheists, but that s the only way to be ac\nLife is a bit of a single one of these industry is a f\nLife is a good idea to get the opposite of the police offic\n```\n\nPrompt **\"So sad that\"**:\n\n```\nSo sad that you can tell you what? 
I think I ve been using it on the scre\nSo sad that I don t know about it, but I don t think I m not afraid to\nSo sad that I m not sure if you re not arguing with the fact that you\nSo sad that I was involved in the future, and I have a few we\nSo sad that s what I said, I m sure you are almost everything you\nSo sad that you can do it, and I don t think that the fact that it s a po\nSo sad that I m not sure if you re arguing with the fact that they are\nSo sad that s the one too much time, but I m not sure if you re arg\nSo sad that you are sadly supposed to be a big deal in the world\nSo sad that I don t know about this, but I m not sure how you can do it, but\n```\n\n**UPDATE 29th June 2023**\n\nAfter the implementation of the GPU trainer, we were able to train larger models.\nHere are some samples from an 8-layer 8-head 128-embedding-degree model, trained on\nthe TinyStories dataset with a vocab-size of 1000:\n\n```\nOnce upon a time, there was a little girl named Lily.\nShe loved to play with her toys and she had a lot of fun.\nOne day, Lily saw a big chicky playing with her toys.\nShe asked her mom, \"Can I play with her toys?\" Her mom said,\n\"Sure, Lily. But we have to clean the pales. Let's suet some candy, Lily.\"\nLily nodded and went to her mom. They played with the mots and staugning her toys.  \n```\n\n```\nOnce upon a time, there was a little girl named Lily.\nShe loved to play outside and explore. One day, she found a jung on the ground.\nShe picked it up and tecked it. She ran around and saw it. She was very sad.\nShe asked her mom for her mom. Her mom said, \"Lily, I'm going to find it!\" Lily said.\nShe ran to the slock and took her to the teplace. She went to the park and found a molla.\n```\n\n```\nThere was a boy named Tim. Tim loved to play with his toys.\nOne day, Tim's mom came to the park. Tim saw a big, red ball and wanted to play with it.\nTim wanted to play with the ball. Tim was very excited. 
He wanted to play with the ball.\nBut the ball was too fast. Tim wanted to play with the ball. But the ball was too fast.\nTim tried to catch it, but it was too fast. Tim was sad. He tried to run away,\nbut he did not want to play. Tim was sad. He did not want to play with the ball.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeyvank%2FfemtoGPT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkeyvank%2FfemtoGPT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeyvank%2FfemtoGPT/lists"}