{"id":13809628,"url":"https://github.com/coderonion/cuda-beginner-course-rust-version","last_synced_at":"2025-06-15T15:16:15.021Z","repository":{"id":220529589,"uuid":"746199702","full_name":"coderonion/cuda-beginner-course-rust-version","owner":"coderonion","description":"bilibili视频【CUDA 12.x 并行编程入门(Rust版)】配套代码 ","archived":false,"fork":false,"pushed_at":"2024-08-12T12:50:35.000Z","size":11,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-06T13:02:36.160Z","etag":null,"topics":["candle","cpp","cublas","cuda","cuda-programming","cudarc","cudnn","gpu","gpu-programming","nvcc","nvidia","parellel-programming","python","rust"],"latest_commit_sha":null,"homepage":"https://www.bilibili.com/video/BV18e411H7bY/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coderonion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-21T11:22:00.000Z","updated_at":"2024-08-18T15:15:43.000Z","dependencies_parsed_at":"2024-02-02T16:00:15.936Z","dependency_job_id":"4c62c33e-9360-4a73-b4a3-4a521e71c6b6","html_url":"https://github.com/coderonion/cuda-beginner-course-rust-version","commit_stats":null,"previous_names":["codingonion/cuda-beginner-course-rust-version","coderonion/cuda-beginner-course-rust-version"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coderonion%2Fcuda-beginner-course-rust-version","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/coderonion%2Fcuda-beginner-course-rust-version/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coderonion%2Fcuda-beginner-course-rust-version/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coderonion%2Fcuda-beginner-course-rust-version/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coderonion","download_url":"https://codeload.github.com/coderonion/cuda-beginner-course-rust-version/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225282506,"owners_count":17449524,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["candle","cpp","cublas","cuda","cuda-programming","cudarc","cudnn","gpu","gpu-programming","nvcc","nvidia","parellel-programming","python","rust"],"created_at":"2024-08-04T02:00:32.918Z","updated_at":"2024-11-19T02:30:59.122Z","avatar_url":"https://github.com/coderonion.png","language":"Rust","readme":"# CUDA-Beginner-Course-Rust-Version\n# CUDA 12.x 并行编程入门(Rust版)\n\n\n\n***Note that this repository is under active development.***\n\n\n\n## Progress\n| Section | Videos                                                       | Codes                                         |\n| :------ | :----------------------------------------------------------- | :-------------------------------------------- |\n| 01      | [第1集 基于Rust的CUDA跨平台开发环境配置与测试](https://www.bilibili.com/video/BV18e411H7bY/) | [course01_hello_cuda](./course01_hello_cuda/) |\n| 02      | [第2集 你好, CUDA! 
(基于cudarc)](https://www.bilibili.com/video/BV1RaecezEMF/) | [course01_hello_cuda](./course01_hello_cuda/) |\n| 03      | [第3集 你好, CUDA! (基于cudarc和bindgen_cuda)](https://www.bilibili.com/video/BV1VveceBEsM/) | [course02_hello_cuda_bindgen](./course02_hello_cuda_bindgen/) |\n|         |                                                              |                                               |\n\n\n\n\n## Todo\n\n- [ ] ...\n\n\n\n## Acknowledgements\n\nThanks to the following excellent public learning resources.\n\n- [codingonion/awesome-cuda-and-hpc](https://github.com/codingonion/awesome-cuda-and-hpc) \u003cimg src=\"https://img.shields.io/github/stars/codingonion/awesome-cuda-and-hpc?style=social\"/\u003e : A collection of some awesome public CUDA, cuBLAS, TensorRT and High Performance Computing (HPC) projects.\n\n- [NVIDIA CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/) : CUDA Toolkit Documentation.\n\n- [NVIDIA CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) : CUDA C++ Programming Guide.\n\n- [NVIDIA CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html) : CUDA C++ Best Practices Guide.\n\n- [NVIDIA/cuda-samples](https://github.com/NVIDIA/cuda-samples) \u003cimg src=\"https://img.shields.io/github/stars/NVIDIA/cuda-samples?style=social\"/\u003e : Samples for CUDA developers that demonstrate features in the CUDA Toolkit.\n\n- [NVIDIA/CUDALibrarySamples](https://github.com/NVIDIA/CUDALibrarySamples) \u003cimg src=\"https://img.shields.io/github/stars/NVIDIA/CUDALibrarySamples?style=social\"/\u003e : CUDA Library Samples.\n\n- [NVIDIA-developer-blog/code-samples](https://github.com/NVIDIA-developer-blog/code-samples) \u003cimg src=\"https://img.shields.io/github/stars/NVIDIA-developer-blog/code-samples?style=social\"/\u003e : Source code examples from the [Parallel Forall Blog](http://developer.nvidia.com/parallel-forall).\n\n- 
[HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese](https://github.com/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese) \u003cimg src=\"https://img.shields.io/github/stars/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese?style=social\"/\u003e : A Chinese translation of the CUDA C Programming Guide.\n\n- [cuda-mode/lectures](https://github.com/cuda-mode/lectures) \u003cimg src=\"https://img.shields.io/github/stars/cuda-mode/lectures?style=social\"/\u003e : Material for cuda-mode lectures.\n\n- [cuda-mode/resource-stream](https://github.com/cuda-mode/resource-stream) \u003cimg src=\"https://img.shields.io/github/stars/cuda-mode/resource-stream?style=social\"/\u003e : CUDA related news and material links.\n\n- [brucefan1983/CUDA-Programming](https://github.com/brucefan1983/CUDA-Programming) \u003cimg src=\"https://img.shields.io/github/stars/brucefan1983/CUDA-Programming?style=social\"/\u003e : Sample codes for my CUDA programming book.\n\n- [YouQixiaowu/CUDA-Programming-with-Python](https://github.com/YouQixiaowu/CUDA-Programming-with-Python) \u003cimg src=\"https://img.shields.io/github/stars/YouQixiaowu/CUDA-Programming-with-Python?style=social\"/\u003e : Python versions of the sample code from the book CUDA Programming, written with the pycuda module.\n\n- [QINZHAOYU/CudaSteps](https://github.com/QINZHAOYU/CudaSteps) \u003cimg src=\"https://img.shields.io/github/stars/QINZHAOYU/CudaSteps?style=social\"/\u003e : A CUDA learning path based on the book 《cuda编程-基础与实践》 by 樊哲勇.\n\n- [sangyc10/CUDA-code](https://github.com/sangyc10/CUDA-code) \u003cimg src=\"https://img.shields.io/github/stars/sangyc10/CUDA-code?style=social\"/\u003e : Companion code for the bilibili video series 【CUDA编程基础入门系列（持续更新）】.\n\n- [RussWong/CUDATutorial](https://github.com/RussWong/CUDATutorial) \u003cimg src=\"https://img.shields.io/github/stars/RussWong/CUDATutorial?style=social\"/\u003e : A CUDA tutorial for learning CUDA programming from scratch.\n\n- [DefTruth/CUDA-Learn-Notes](https://github.com/DefTruth/CUDA-Learn-Notes) \u003cimg 
src=\"https://img.shields.io/github/stars/DefTruth/CUDA-Learn-Notes?style=social\"/\u003e : 🎉CUDA/C++ notes, hand-written CUDA kernels for large models, and tech blog posts, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.\n\n- [BBuf/how-to-optim-algorithm-in-cuda](https://github.com/BBuf/how-to-optim-algorithm-in-cuda) \u003cimg src=\"https://img.shields.io/github/stars/BBuf/how-to-optim-algorithm-in-cuda?style=social\"/\u003e : How to optimize some algorithms in CUDA.\n\n- [PaddleJitLab/CUDATutorial](https://github.com/PaddleJitLab/CUDATutorial) \u003cimg src=\"https://img.shields.io/github/stars/PaddleJitLab/CUDATutorial?style=social\"/\u003e : A self-learning tutorial for CUDA high-performance programming, starting from scratch.\n\n- [leimao/CUDA-GEMM-Optimization](https://github.com/leimao/CUDA-GEMM-Optimization) \u003cimg src=\"https://img.shields.io/github/stars/leimao/CUDA-GEMM-Optimization?style=social\"/\u003e : [CUDA Matrix Multiplication Optimization](https://leimao.github.io/article/CUDA-Matrix-Multiplication-Optimization/). This repository contains the CUDA kernels for general matrix-matrix multiplication (GEMM) and the corresponding performance analysis.\n\n- [Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU) \u003cimg src=\"https://img.shields.io/github/stars/Liu-xiandong/How_to_optimize_in_GPU?style=social\"/\u003e : This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. 
The performance of these kernels is basically at or near the theoretical limit.\n\n- [Bruce-Lee-LY/matrix_multiply](https://github.com/Bruce-Lee-LY/matrix_multiply) \u003cimg src=\"https://img.shields.io/github/stars/Bruce-Lee-LY/matrix_multiply?style=social\"/\u003e : Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.\n\n- [Bruce-Lee-LY/cuda_hgemm](https://github.com/Bruce-Lee-LY/cuda_hgemm) \u003cimg src=\"https://img.shields.io/github/stars/Bruce-Lee-LY/cuda_hgemm?style=social\"/\u003e : Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.\n\n- [Bruce-Lee-LY/cuda_hgemv](https://github.com/Bruce-Lee-LY/cuda_hgemv) \u003cimg src=\"https://img.shields.io/github/stars/Bruce-Lee-LY/cuda_hgemv?style=social\"/\u003e : Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.\n\n- [enp1s0/ozIMMU](https://github.com/enp1s0/ozIMMU) \u003cimg src=\"https://img.shields.io/github/stars/enp1s0/ozIMMU?style=social\"/\u003e : FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme. 
[arxiv.org/abs/2306.11975](https://arxiv.org/abs/2306.11975)\n\n- [Cjkkkk/CUDA_gemm](https://github.com/Cjkkkk/CUDA_gemm) \u003cimg src=\"https://img.shields.io/github/stars/Cjkkkk/CUDA_gemm?style=social\"/\u003e : A simple high performance CUDA GEMM implementation.\n\n- [AyakaGEMM/Hands-on-GEMM](https://github.com/AyakaGEMM/Hands-on-GEMM) \u003cimg src=\"https://img.shields.io/github/stars/AyakaGEMM/Hands-on-GEMM?style=social\"/\u003e : A GEMM tutorial.\n\n- [AyakaGEMM/Hands-on-MLIR](https://github.com/AyakaGEMM/Hands-on-MLIR) \u003cimg src=\"https://img.shields.io/github/stars/AyakaGEMM/Hands-on-MLIR?style=social\"/\u003e : Hands-on-MLIR.\n\n- [zpzim/MSplitGEMM](https://github.com/zpzim/MSplitGEMM) \u003cimg src=\"https://img.shields.io/github/stars/zpzim/MSplitGEMM?style=social\"/\u003e : Large matrix multiplication in CUDA.\n\n- [jundaf2/CUDA-INT8-GEMM](https://github.com/jundaf2/CUDA-INT8-GEMM) \u003cimg src=\"https://img.shields.io/github/stars/jundaf2/CUDA-INT8-GEMM?style=social\"/\u003e : CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API.\n\n- [chanzhennan/cuda_gemm_benchmark](https://github.com/chanzhennan/cuda_gemm_benchmark) \u003cimg src=\"https://img.shields.io/github/stars/chanzhennan/cuda_gemm_benchmark?style=social\"/\u003e : Based on gtest/benchmark; refers to [https://github.com/Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU).\n\n- [YuxueYang1204/CudaDemo](https://github.com/YuxueYang1204/CudaDemo) \u003cimg src=\"https://img.shields.io/github/stars/YuxueYang1204/CudaDemo?style=social\"/\u003e : Implement custom operators in PyTorch with CUDA/C++.\n\n- [CoffeeBeforeArch/cuda_programming](https://github.com/CoffeeBeforeArch/cuda_programming) \u003cimg src=\"https://img.shields.io/github/stars/CoffeeBeforeArch/cuda_programming?style=social\"/\u003e : Code from the \"CUDA Crash Course\" YouTube series by CoffeeBeforeArch.\n\n- 
[rbaygildin/learn-gpgpu](https://github.com/rbaygildin/learn-gpgpu) \u003cimg src=\"https://img.shields.io/github/stars/rbaygildin/learn-gpgpu?style=social\"/\u003e : Algorithms implemented in CUDA + resources about GPGPU.\n\n- [godweiyang/NN-CUDA-Example](https://github.com/godweiyang/NN-CUDA-Example) \u003cimg src=\"https://img.shields.io/github/stars/godweiyang/NN-CUDA-Example?style=social\"/\u003e : Several simple examples for popular neural network toolkits calling custom CUDA operators.\n\n- [yhwang-hub/Matrix_Multiplication_Performance_Optimization](https://github.com/yhwang-hub/Matrix_Multiplication_Performance_Optimization) \u003cimg src=\"https://img.shields.io/github/stars/yhwang-hub/Matrix_Multiplication_Performance_Optimization?style=social\"/\u003e : Matrix Multiplication Performance Optimization.\n\n- [yao-jiashu/KernelCodeGen](https://github.com/yao-jiashu/KernelCodeGen) \u003cimg src=\"https://img.shields.io/github/stars/yao-jiashu/KernelCodeGen?style=social\"/\u003e : GEMM/Conv2d CUDA/HIP kernel code generation using MLIR.\n\n- [caiwanxianhust/ClusteringByCUDA](https://github.com/caiwanxianhust/ClusteringByCUDA) \u003cimg src=\"https://img.shields.io/github/stars/caiwanxianhust/ClusteringByCUDA?style=social\"/\u003e : A series of clustering algorithms implemented in CUDA C++.\n\n- [ulrichstern/cuda-convnet](https://github.com/ulrichstern/cuda-convnet) \u003cimg src=\"https://img.shields.io/github/stars/ulrichstern/cuda-convnet?style=social\"/\u003e : Alex Krizhevsky's original code from Google Code. 
See the article from the WeChat official account 「人工智能大讲堂」: [Found: AlexNet's original source code, hand-written from scratch in CUDA/C++ with no framework](https://mp.weixin.qq.com/s/plxXG8y5QlxSionyjyPXqw).\n\n- [PacktPublishing/Learn-CUDA-Programming](https://github.com/PacktPublishing/Learn-CUDA-Programming) \u003cimg src=\"https://img.shields.io/github/stars/PacktPublishing/Learn-CUDA-Programming?style=social\"/\u003e : Learn CUDA Programming, published by Packt.\n\n- [PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA) \u003cimg src=\"https://img.shields.io/github/stars/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA?style=social\"/\u003e : Hands-On GPU Programming with Python and CUDA, published by Packt.\n\n- [PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA) \u003cimg src=\"https://img.shields.io/github/stars/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA?style=social\"/\u003e : Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt.\n\n- [codingonion/cuda-beginner-course-cpp-version](https://github.com/codingonion/cuda-beginner-course-cpp-version) \u003cimg src=\"https://img.shields.io/github/stars/codingonion/cuda-beginner-course-cpp-version?style=social\"/\u003e : Companion code for the bilibili video series 【CUDA 12.x 并行编程入门(C++版)】.\n\n- [codingonion/cuda-beginner-course-python-version](https://github.com/codingonion/cuda-beginner-course-python-version) \u003cimg src=\"https://img.shields.io/github/stars/codingonion/cuda-beginner-course-python-version?style=social\"/\u003e : Companion code for the bilibili video series 【CUDA 12.x 并行编程入门(Python版)】.\n\n- [codingonion/cuda-beginner-course-rust-version](https://github.com/codingonion/cuda-beginner-course-rust-version) \u003cimg src=\"https://img.shields.io/github/stars/codingonion/cuda-beginner-course-rust-version?style=social\"/\u003e : Companion code for the bilibili video series 【CUDA 12.x 
并行编程入门(Rust版)】.\n\n","funding_links":[],"categories":["Learning Resources"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoderonion%2Fcuda-beginner-course-rust-version","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoderonion%2Fcuda-beginner-course-rust-version","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoderonion%2Fcuda-beginner-course-rust-version/lists"}