https://github.com/coderonion/cuda-beginner-course-cpp-version

Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (C++ Edition)".
cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust


# CUDA-Beginner-Course-CPP-Version
# CUDA 12.x Parallel Programming for Beginners (C++ Edition)

***Note that this repository is under active development.***

## Progress
| Section | Video | Code |
| :------ | :----------------------------------------------------------- | :-------------------------------------------- |
| 01 | [Episode 1: Introduction to CUDA and Setting Up the Windows Development Environment](https://www.bilibili.com/video/BV1Sj411H7Qq/) | / |
| 02 | [Episode 2: Installing the CUDA Development Environment on Ubuntu](https://www.bilibili.com/video/BV1je411U7yX/) | / |
| 03 | [Episode 3: Running Your First CUDA Program on Windows and Ubuntu](https://www.bilibili.com/video/BV1oc411x7Gt/) | [course01_hello_cuda](./course01_hello_cuda/) |
| 04 | [Episode 4: Hello, CUDA!](https://www.bilibili.com/video/BV1jueweLEQ1/) | [course01_hello_cuda](./course01_hello_cuda/) |

## Todo

- [ ] ...

## Acknowledgements

Thanks to the following excellent public learning resources.

- [codingonion/awesome-cuda-and-hpc](https://github.com/codingonion/awesome-cuda-and-hpc) : A collection of some awesome public CUDA, cuBLAS, TensorRT and High Performance Computing (HPC) projects.

- [NVIDIA CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/) : CUDA Toolkit Documentation.

- [NVIDIA CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) : CUDA C++ Programming Guide.

- [NVIDIA CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html) : CUDA C++ Best Practices Guide.

- [NVIDIA/cuda-samples](https://github.com/NVIDIA/cuda-samples) : Samples for CUDA developers that demonstrate features in the CUDA Toolkit.

- [NVIDIA/CUDALibrarySamples](https://github.com/NVIDIA/CUDALibrarySamples) : CUDA Library Samples.

- [NVIDIA-developer-blog/code-samples](https://github.com/NVIDIA-developer-blog/code-samples) : Source code examples from the [Parallel Forall Blog](http://developer.nvidia.com/parallel-forall).

- [HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese](https://github.com/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese) : A Chinese translation of the CUDA C Programming Guide.

- [cuda-mode/lectures](https://github.com/cuda-mode/lectures) : Material for cuda-mode lectures.

- [cuda-mode/resource-stream](https://github.com/cuda-mode/resource-stream) : CUDA related news and material links.

- [brucefan1983/CUDA-Programming](https://github.com/brucefan1983/CUDA-Programming) : Sample codes for my CUDA programming book.

- [YouQixiaowu/CUDA-Programming-with-Python](https://github.com/YouQixiaowu/CUDA-Programming-with-Python) : Python versions of the sample code from the book CUDA Programming, written with the pycuda module.

- [QINZHAOYU/CudaSteps](https://github.com/QINZHAOYU/CudaSteps) : A CUDA learning path based on the book 《cuda编程-基础与实践》 (CUDA Programming: Basics and Practice) by 樊哲勇 (Zheyong Fan).

- [sangyc10/CUDA-code](https://github.com/sangyc10/CUDA-code) : Companion code for the bilibili video series "CUDA Programming Basics for Beginners (continuously updated)".

- [RussWong/CUDATutorial](https://github.com/RussWong/CUDATutorial) : A CUDA tutorial for learning CUDA programming from scratch.

- [DefTruth/CUDA-Learn-Notes](https://github.com/DefTruth/CUDA-Learn-Notes) : 🎉 CUDA/C++ notes, hand-written CUDA kernels for large models, and tech blog posts, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

- [BBuf/how-to-optim-algorithm-in-cuda](https://github.com/BBuf/how-to-optim-algorithm-in-cuda) : How to optimize some algorithms in CUDA.

- [PaddleJitLab/CUDATutorial](https://github.com/PaddleJitLab/CUDATutorial) : A self-learning tutorial for CUDA high-performance programming, starting from scratch.

- [leimao/CUDA-GEMM-Optimization](https://github.com/leimao/CUDA-GEMM-Optimization) : [CUDA Matrix Multiplication Optimization](https://leimao.github.io/article/CUDA-Matrix-Multiplication-Optimization/). This repository contains the CUDA kernels for general matrix-matrix multiplication (GEMM) and the corresponding performance analysis.

- [Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU) : This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.

- [Bruce-Lee-LY/matrix_multiply](https://github.com/Bruce-Lee-LY/matrix_multiply) : Several common methods of matrix multiplication implemented on CPU and NVIDIA GPU using C++11 and CUDA.

- [Bruce-Lee-LY/cuda_hgemm](https://github.com/Bruce-Lee-LY/cuda_hgemm) : Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.

- [Bruce-Lee-LY/cuda_hgemv](https://github.com/Bruce-Lee-LY/cuda_hgemv) : Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.

- [enp1s0/ozIMMU](https://github.com/enp1s0/ozIMMU) : FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme. [arxiv.org/abs/2306.11975](https://arxiv.org/abs/2306.11975)

- [Cjkkkk/CUDA_gemm](https://github.com/Cjkkkk/CUDA_gemm) : A simple high performance CUDA GEMM implementation.

- [AyakaGEMM/Hands-on-GEMM](https://github.com/AyakaGEMM/Hands-on-GEMM) : A GEMM tutorial.

- [AyakaGEMM/Hands-on-MLIR](https://github.com/AyakaGEMM/Hands-on-MLIR) : Hands-on-MLIR.

- [zpzim/MSplitGEMM](https://github.com/zpzim/MSplitGEMM) : Large matrix multiplication in CUDA.

- [jundaf2/CUDA-INT8-GEMM](https://github.com/jundaf2/CUDA-INT8-GEMM) : CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API.

- [chanzhennan/cuda_gemm_benchmark](https://github.com/chanzhennan/cuda_gemm_benchmark) : GEMM benchmarks based on gtest/benchmark, with reference to [https://github.com/Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU).

- [YuxueYang1204/CudaDemo](https://github.com/YuxueYang1204/CudaDemo) : Implement custom operators in PyTorch with cuda/c++.

- [CoffeeBeforeArch/cuda_programming](https://github.com/CoffeeBeforeArch/cuda_programming) : Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch.

- [rbaygildin/learn-gpgpu](https://github.com/rbaygildin/learn-gpgpu) : Algorithms implemented in CUDA + resources about GPGPU.

- [godweiyang/NN-CUDA-Example](https://github.com/godweiyang/NN-CUDA-Example) : Several simple examples for popular neural network toolkits calling custom CUDA operators.

- [yhwang-hub/Matrix_Multiplication_Performance_Optimization](https://github.com/yhwang-hub/Matrix_Multiplication_Performance_Optimization) : Matrix Multiplication Performance Optimization.

- [yao-jiashu/KernelCodeGen](https://github.com/yao-jiashu/KernelCodeGen) : GEMM/Conv2d CUDA/HIP kernel code generation using MLIR.

- [caiwanxianhust/ClusteringByCUDA](https://github.com/caiwanxianhust/ClusteringByCUDA) : A series of clustering algorithms implemented in CUDA C++.

- [ulrichstern/cuda-convnet](https://github.com/ulrichstern/cuda-convnet) : Alex Krizhevsky's original code from Google Code. See the article "[Found: the original AlexNet source code, hand-written from scratch in CUDA/C++ with no framework](https://mp.weixin.qq.com/s/plxXG8y5QlxSionyjyPXqw)" from the WeChat official account 人工智能大讲堂.

- [PacktPublishing/Learn-CUDA-Programming](https://github.com/PacktPublishing/Learn-CUDA-Programming) : Learn CUDA Programming, published by Packt.

- [PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA) : Hands-On GPU Programming with Python and CUDA, published by Packt.

- [PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA) : Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt.

- [codingonion/cuda-beginner-course-cpp-version](https://github.com/codingonion/cuda-beginner-course-cpp-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (C++ Edition)".

- [codingonion/cuda-beginner-course-python-version](https://github.com/codingonion/cuda-beginner-course-python-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Python Edition)".

- [codingonion/cuda-beginner-course-rust-version](https://github.com/codingonion/cuda-beginner-course-rust-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Rust Edition)".