Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/coderonion/cuda-beginner-course-cpp-version
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)".
- Host: GitHub
- URL: https://github.com/coderonion/cuda-beginner-course-cpp-version
- Owner: coderonion
- License: mit
- Created: 2024-01-21T09:18:28.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-12T12:50:10.000Z (6 months ago)
- Last Synced: 2024-10-06T13:02:35.838Z (4 months ago)
- Topics: cpp, cublas, cuda, cuda-programming, cudnn, gpu, gpu-programming, nvcc, nvidia, parallel-programming, python, rust
- Language: Cuda
- Homepage: https://www.bilibili.com/video/BV1Sj411H7Qq/
- Size: 20.5 KB
- Stars: 25
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-cuda-triton-hpc - codingonion/cuda-beginner-course-cpp-version : Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)". (Learning Resources)
README
# CUDA-Beginner-Course-CPP-Version
# Introduction to CUDA 12.x Parallel Programming (C++ Edition)
***Note that this repository is under active development.***
## Progress
| Section | Videos | Codes |
| :------ | :----------------------------------------------------------- | :-------------------------------------------- |
| 01 | [Episode 1: Introduction to CUDA and Installing the Windows Development Environment](https://www.bilibili.com/video/BV1Sj411H7Qq/) | / |
| 02 | [Episode 2: Installing the CUDA Development Environment on Ubuntu](https://www.bilibili.com/video/BV1je411U7yX/) | / |
| 03 | [Episode 3: Running Your First CUDA Program on Windows and Ubuntu](https://www.bilibili.com/video/BV1oc411x7Gt/) | [course01_hello_cuda](./course01_hello_cuda/) |
| 04 | [Episode 4: Hello, CUDA!](https://www.bilibili.com/video/BV1jueweLEQ1/) | [course01_hello_cuda](./course01_hello_cuda/) |
## Todo
- [ ] ...
## Acknowledgements
Thanks to the following excellent public learning resources.
- [codingonion/awesome-cuda-and-hpc](https://github.com/codingonion/awesome-cuda-and-hpc)
: A collection of some awesome public CUDA, cuBLAS, TensorRT and High Performance Computing (HPC) projects.
- [NVIDIA CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/) : CUDA Toolkit Documentation.
- [NVIDIA CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) : CUDA C++ Programming Guide.
- [NVIDIA CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html) : CUDA C++ Best Practices Guide.
- [NVIDIA/cuda-samples](https://github.com/NVIDIA/cuda-samples)
: Samples for CUDA developers that demonstrate features in the CUDA Toolkit.
- [NVIDIA/CUDALibrarySamples](https://github.com/NVIDIA/CUDALibrarySamples)
: CUDA Library Samples.
- [NVIDIA-developer-blog/code-samples](https://github.com/NVIDIA-developer-blog/code-samples)
: Source code examples from the [Parallel Forall Blog](http://developer.nvidia.com/parallel-forall).
- [HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese](https://github.com/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese)
: A Chinese translation of the CUDA C Programming Guide.
- [cuda-mode/lectures](https://github.com/cuda-mode/lectures)
: Material for cuda-mode lectures.
- [cuda-mode/resource-stream](https://github.com/cuda-mode/resource-stream)
: CUDA related news and material links.
- [brucefan1983/CUDA-Programming](https://github.com/brucefan1983/CUDA-Programming)
: Sample codes for my CUDA programming book.
- [YouQixiaowu/CUDA-Programming-with-Python](https://github.com/YouQixiaowu/CUDA-Programming-with-Python)
: Python versions of the sample code from the book CUDA Programming, using the pycuda module.
- [QINZHAOYU/CudaSteps](https://github.com/QINZHAOYU/CudaSteps)
: A CUDA learning path based on the book 《cuda编程-基础与实践》 (CUDA Programming: Basics and Practice) by Fan Zheyong (樊哲勇).
- [sangyc10/CUDA-code](https://github.com/sangyc10/CUDA-code)
: Companion code for the bilibili video series "CUDA Programming Fundamentals: Introductory Series (continuously updated)".
- [RussWong/CUDATutorial](https://github.com/RussWong/CUDATutorial)
: A CUDA tutorial for learning CUDA programming from scratch.
- [DefTruth/CUDA-Learn-Notes](https://github.com/DefTruth/CUDA-Learn-Notes)
: 🎉 CUDA/C++ notes, hand-written CUDA kernels for large models, and technical blog posts, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
- [BBuf/how-to-optim-algorithm-in-cuda](https://github.com/BBuf/how-to-optim-algorithm-in-cuda)
: How to optimize some algorithms in CUDA.
- [PaddleJitLab/CUDATutorial](https://github.com/PaddleJitLab/CUDATutorial)
: A self-learning tutorial for CUDA high-performance programming, starting from scratch.
- [leimao/CUDA-GEMM-Optimization](https://github.com/leimao/CUDA-GEMM-Optimization)
: [CUDA Matrix Multiplication Optimization](https://leimao.github.io/article/CUDA-Matrix-Multiplication-Optimization/). This repository contains the CUDA kernels for general matrix-matrix multiplication (GEMM) and the corresponding performance analysis.
- [Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU)
: This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
- [Bruce-Lee-LY/matrix_multiply](https://github.com/Bruce-Lee-LY/matrix_multiply)
: Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.
- [Bruce-Lee-LY/cuda_hgemm](https://github.com/Bruce-Lee-LY/cuda_hgemm)
: Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
- [Bruce-Lee-LY/cuda_hgemv](https://github.com/Bruce-Lee-LY/cuda_hgemv)
: Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.
- [enp1s0/ozIMMU](https://github.com/enp1s0/ozIMMU)
: FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme. [arxiv.org/abs/2306.11975](https://arxiv.org/abs/2306.11975)
- [Cjkkkk/CUDA_gemm](https://github.com/Cjkkkk/CUDA_gemm)
: A simple high performance CUDA GEMM implementation.
- [AyakaGEMM/Hands-on-GEMM](https://github.com/AyakaGEMM/Hands-on-GEMM)
: A GEMM tutorial.
- [AyakaGEMM/Hands-on-MLIR](https://github.com/AyakaGEMM/Hands-on-MLIR)
: Hands-on-MLIR.
- [zpzim/MSplitGEMM](https://github.com/zpzim/MSplitGEMM)
: Large matrix multiplication in CUDA.
- [jundaf2/CUDA-INT8-GEMM](https://github.com/jundaf2/CUDA-INT8-GEMM)
: CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API.
- [chanzhennan/cuda_gemm_benchmark](https://github.com/chanzhennan/cuda_gemm_benchmark)
: Based on gtest/benchmark; refers to [https://github.com/Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU).
- [YuxueYang1204/CudaDemo](https://github.com/YuxueYang1204/CudaDemo)
: Implement custom operators in PyTorch with cuda/c++.
- [CoffeeBeforeArch/cuda_programming](https://github.com/CoffeeBeforeArch/cuda_programming)
: Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch.
- [rbaygildin/learn-gpgpu](https://github.com/rbaygildin/learn-gpgpu)
: Algorithms implemented in CUDA + resources about GPGPU.
- [godweiyang/NN-CUDA-Example](https://github.com/godweiyang/NN-CUDA-Example)
: Several simple examples for popular neural network toolkits calling custom CUDA operators.
- [yhwang-hub/Matrix_Multiplication_Performance_Optimization](https://github.com/yhwang-hub/Matrix_Multiplication_Performance_Optimization)
: Matrix Multiplication Performance Optimization.
- [yao-jiashu/KernelCodeGen](https://github.com/yao-jiashu/KernelCodeGen)
: GEMM/Conv2d CUDA/HIP kernel code generation using MLIR.
- [caiwanxianhust/ClusteringByCUDA](https://github.com/caiwanxianhust/ClusteringByCUDA)
: A series of clustering algorithms implemented in CUDA C++.
- [ulrichstern/cuda-convnet](https://github.com/ulrichstern/cuda-convnet)
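: Alex Krizhevsky's original code from Google Code. See the article from the WeChat official account 「人工智能大讲堂」: [Found: the original AlexNet source code, written from scratch in CUDA/C++ without any framework](https://mp.weixin.qq.com/s/plxXG8y5QlxSionyjyPXqw).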
- [PacktPublishing/Learn-CUDA-Programming](https://github.com/PacktPublishing/Learn-CUDA-Programming)
: Learn CUDA Programming, published by Packt.
- [PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA)
: Hands-On GPU Programming with Python and CUDA, published by Packt.
- [PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA)
: Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt.
- [codingonion/cuda-beginner-course-cpp-version](https://github.com/codingonion/cuda-beginner-course-cpp-version)
: Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)".
- [codingonion/cuda-beginner-course-python-version](https://github.com/codingonion/cuda-beginner-course-python-version)
: Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)".
- [codingonion/cuda-beginner-course-rust-version](https://github.com/codingonion/cuda-beginner-course-rust-version)
: Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Rust Edition)".