https://github.com/codingonion/cuda-beginner-course-rust-version

Companion code for the bilibili video series "CUDA 12.1 Parallel Programming for Beginners (Rust Edition)"

candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust

# CUDA-Beginner-Course-Rust-Version
# CUDA 12.1 Parallel Programming for Beginners (Rust Edition)

***Note that this repository is under active development.***

## Progress
| Section | Videos | Code |
| :------ | :----------------------------------------------------------- | :-------------------------------------------- |
| 01 | [Episode 1: Setting Up and Testing a Cross-Platform CUDA Development Environment with Rust](https://www.bilibili.com/video/BV18e411H7bY/) | [course01_hello_cuda](./course01_hello_cuda/) |
| | | |

## Todo

- [ ] ...
- [ ] ...

## Acknowledgements

Thanks to the following excellent public learning resources.

- [codingonion/awesome-cuda-tensorrt-fpga](https://github.com/codingonion/awesome-cuda-tensorrt-fpga) : A collection of some awesome public NVIDIA CUDA, TensorRT, AMD ROCm and FPGA projects.

- [codingonion/cuda-beginner-course-cpp-version](https://github.com/codingonion/cuda-beginner-course-cpp-version) : Companion code for the bilibili video series "CUDA 12.1 Parallel Programming for Beginners (C++ Edition)".

- [codingonion/cuda-beginner-course-rust-version](https://github.com/codingonion/cuda-beginner-course-rust-version) : Companion code for the bilibili video series "CUDA 12.1 Parallel Programming for Beginners (Rust Edition)".

- [codingonion/cuda-beginner-course-python-version](https://github.com/codingonion/cuda-beginner-course-python-version) : Companion code for the bilibili video series "CUDA 12.1 Parallel Programming for Beginners (Python Edition)".

- [NVIDIA CUDA Docs](https://docs.nvidia.com/cuda/) : CUDA Toolkit Documentation.

- [NVIDIA/cuda-samples](https://github.com/NVIDIA/cuda-samples) : Samples for CUDA developers that demonstrate features in the CUDA Toolkit.

- [NVIDIA/CUDALibrarySamples](https://github.com/NVIDIA/CUDALibrarySamples) : CUDA Library Samples.

- [HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese](https://github.com/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese) : A Chinese translation of the CUDA C Programming Guide.

- [brucefan1983/CUDA-Programming](https://github.com/brucefan1983/CUDA-Programming) : Sample codes for my CUDA programming book.

- [YouQixiaowu/CUDA-Programming-with-Python](https://github.com/YouQixiaowu/CUDA-Programming-with-Python) : Python versions of the sample code from the book CUDA Programming, written with the pycuda module.

- [QINZHAOYU/CudaSteps](https://github.com/QINZHAOYU/CudaSteps) : A CUDA learning path based on the book 《cuda编程-基础与实践》 (CUDA Programming: Basics and Practice) by 樊哲勇 (Zheyong Fan).

- [sangyc10/CUDA-code](https://github.com/sangyc10/CUDA-code) : Companion code for the bilibili video tutorial series "CUDA Programming Basics for Beginners (continuously updated)".

- [RussWong/CUDATutorial](https://github.com/RussWong/CUDATutorial) : A CUDA tutorial for learning CUDA programming from scratch.

- [DefTruth/cuda-learn-note](https://github.com/DefTruth/cuda-learn-note) : 🎉 CUDA notes, a collection of frequently asked interview questions, and C++ notes. Personal notes, updated irregularly: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

- [Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU) : This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

- [enp1s0/ozIMMU](https://github.com/enp1s0/ozIMMU) : FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme. [arxiv.org/abs/2306.11975](https://arxiv.org/abs/2306.11975)

- [Bruce-Lee-LY/matrix_multiply](https://github.com/Bruce-Lee-LY/matrix_multiply) : Several common methods of matrix multiplication implemented on CPU and NVIDIA GPU using C++11 and CUDA.

- [Bruce-Lee-LY/cuda_hgemm](https://github.com/Bruce-Lee-LY/cuda_hgemm) : Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

- [Bruce-Lee-LY/cuda_hgemv](https://github.com/Bruce-Lee-LY/cuda_hgemv) : Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.

- [Cjkkkk/CUDA_gemm](https://github.com/Cjkkkk/CUDA_gemm) : A simple high performance CUDA GEMM implementation.

- [AyakaGEMM/Hands-on-GEMM](https://github.com/AyakaGEMM/Hands-on-GEMM) : A GEMM tutorial.

- [zpzim/MSplitGEMM](https://github.com/zpzim/MSplitGEMM) : Large matrix multiplication in CUDA.

- [jundaf2/CUDA-INT8-GEMM](https://github.com/jundaf2/CUDA-INT8-GEMM) : CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API.

- [chanzhennan/cuda_gemm_benchmark](https://github.com/chanzhennan/cuda_gemm_benchmark) : Based on gtest/benchmark; refers to [https://github.com/Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU).

- [YuxueYang1204/CudaDemo](https://github.com/YuxueYang1204/CudaDemo) : Implement custom operators in PyTorch with cuda/c++.

- [CoffeeBeforeArch/cuda_programming](https://github.com/CoffeeBeforeArch/cuda_programming) : Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch.

- [rbaygildin/learn-gpgpu](https://github.com/rbaygildin/learn-gpgpu) : Algorithms implemented in CUDA + resources about GPGPU.

- [PacktPublishing/Learn-CUDA-Programming](https://github.com/PacktPublishing/Learn-CUDA-Programming) : Learn CUDA Programming, published by Packt.

- [PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA) : Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt.

- [PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA) : Hands-On GPU Programming with Python and CUDA, published by Packt.