{"id":20475587,"url":"https://github.com/gritukan/hamkaas","last_synced_at":"2025-04-13T12:30:20.807Z","repository":{"id":257282878,"uuid":"844716014","full_name":"gritukan/hamkaas","owner":"gritukan","description":null,"archived":false,"fork":false,"pushed_at":"2024-09-27T20:56:59.000Z","size":821,"stargazers_count":25,"open_issues_count":3,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-27T03:35:10.914Z","etag":null,"topics":["cublas","cuda","cudnn","deep-learning","diy","inference"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gritukan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-19T20:35:41.000Z","updated_at":"2025-01-25T14:50:45.000Z","dependencies_parsed_at":"2024-09-15T18:40:27.942Z","dependency_job_id":null,"html_url":"https://github.com/gritukan/hamkaas","commit_stats":null,"previous_names":["gritukan/hamkaas"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gritukan%2Fhamkaas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gritukan%2Fhamkaas/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gritukan%2Fhamkaas/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gritukan%2Fhamkaas/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gritukan","download_url":"https://codeload.github.com/gritukan/hamkaas/tar.gz/refs/heads
/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248714187,"owners_count":21149845,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cublas","cuda","cudnn","deep-learning","diy","inference"],"created_at":"2024-11-15T15:16:34.042Z","updated_at":"2025-04-13T12:30:20.770Z","avatar_url":"https://github.com/gritukan.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HamKaas: Build a Simple Inference Compiler\n\nHave you ever wondered how modern compilers for deep learning work? Or do you just want to get some hands-on experience with CUDA programming? Then this may be for you. This repository consists of 5 labs that start from simple CUDA programs and end with a simple compiler capable of running LLaMA 2 7b. You definitely won't become an expert in CUDA computing after this, but hopefully you will get a basic understanding of the concepts behind modern deep learning.\n\n# Disclaimer\n\nI have just finished this project, so the code and especially the texts have not been tested enough, and bugs are likely. If you do not want to take a course with potential bugs, I suggest waiting a while until it becomes more stable.\n\n# Prerequisites\n\nYou need basic knowledge of C++ and Python programming. A basic understanding of deep learning is a plus but not required. No prior experience with CUDA is needed.\n\nYou will also need access to a host with a CUDA-compatible GPU and the NVIDIA CUDA Toolkit installed. 
We will check that everything is set up correctly at the beginning of the first lab.\n\n# Labs\n\nThe course consists of a series of labs. Technically, you can do them in any order (except for lab 4 and lab 5, because the latter depends on the former), but doing them in order is recommended.\n\nIf you get stuck on something, feel free to discuss it in the [Discord channel](https://discord.gg/68asvG24Y8). Please do not create GitHub issues in this case. Use them only for bugs you have found and for feature requests.\n\n[Lab 1: CUDA Basics](lab1/README.md). In this lab you will learn the basics of CUDA programming and write your first kernels.\n\n[Lab 2: GPU Performance](lab2/README.md). In this lab you will learn how to profile your CUDA programs. You will also learn about modern GPU architecture and use that knowledge to optimize your programs.\n\n[Lab 3: CUDA Libraries](lab3/README.md). In this lab you will learn about the cuBLAS and cuDNN libraries and implement a simple neural network inference using them.\n\n[Lab 4: HamKaas Part 1](lab4/README.md). In this lab you will start working on the HamKaas compiler and learn about the basic concepts behind compilers.\n\n[Lab 5: HamKaas Part 2](lab5/README.md). In this lab you will implement an optimizer for your compiler. At the end, you will add new operations to your compiler and be able to run LLaMA 2 7b.\n\nLab 6: Distributed Inference. This lab is not ready yet; consider liking this [issue](https://github.com/gritukan/hamkaas/issues/2) if you are interested in it. This lab will be about NCCL and distributed deep learning algorithms.\n\n# FAQ\n\n## Why HamKaas?\n\nHam is Dutch for ham and kaas is Dutch for cheese. Ham-kaas croissants are sold near my workplace in the Netherlands. My colleague thinks that they are tasteless, while I like them for their simplicity.\n\nThat reflects the simplicity of the compiler we are going to build. 
It is simple, but its simplicity is good for educational purposes.\n\n## Can I complete this course without access to a CUDA-compatible GPU?\n\nUnfortunately, no. You need access to a host with a CUDA-compatible GPU in order to run the programs you write. You can get access to such a host through a cloud provider.\n\nThe good news is that you do not need a powerful GPU. Any GPU that supports CUDA will be enough for all the labs except the last one, which involves LLaMA 2 7b, and almost any GPU will be enough for that lab as well.\n\n## It is possible to write CUDA code in Python. Why do we use C++?\n\nI consider C++ to be a better language for such courses because it is more low-level and closer to the hardware.\n\nHowever, I think that it is possible to do everything required in this course in Python, so if you want to have the same course but Python-first, consider liking this [issue](https://github.com/gritukan/hamkaas/issues/1). I am not sure if I will have time to do it, but understanding the demand is useful anyway :)\n\n# Contributing and Discussions\n\nThis course is pretty young, so I am open to any suggestions and contributions.\n\nIf you have any questions (either about solving a lab or about the project in general), feel free to ask them in the [Discord channel](https://discord.gg/68asvG24Y8).\n\nIf you have found a bug, please create an issue (or pull request!) in the repository.\n\nIf you have an idea for a new lab or task, great! Please create an issue with the idea and we will discuss it.\n\n# License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgritukan%2Fhamkaas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgritukan%2Fhamkaas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgritukan%2Fhamkaas/lists"}