Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/coderonion/cuda-beginner-course-cpp-version
Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (C++ Edition)"
cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust
- Host: GitHub
- URL: https://github.com/coderonion/cuda-beginner-course-cpp-version
- Owner: coderonion
- License: mit
- Created: 2024-01-21T09:18:28.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-08-12T12:50:10.000Z (4 months ago)
- Last Synced: 2024-10-06T13:02:35.838Z (3 months ago)
- Topics: cpp, cublas, cuda, cuda-programming, cudnn, gpu, gpu-programming, nvcc, nvidia, parallel-programming, python, rust
- Language: Cuda
- Homepage: https://www.bilibili.com/video/BV1Sj411H7Qq/
- Size: 20.5 KB
- Stars: 25
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-cuda-triton-hpc - codingonion/cuda-beginner-course-cpp-version : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (C++ Edition)". (Learning Resources)
README
# CUDA-Beginner-Course-CPP-Version
# CUDA 12.x Parallel Programming for Beginners (C++ Edition)

***Note that this repository is under active development.***
## Progress
| Section | Videos | Codes |
| :------ | :----------------------------------------------------------- | :-------------------------------------------- |
| 01 | [Episode 1: Introduction to CUDA and Setting Up the Windows Development Environment](https://www.bilibili.com/video/BV1Sj411H7Qq/) | / |
| 02 | [Episode 2: Installing the CUDA Development Environment on Ubuntu](https://www.bilibili.com/video/BV1je411U7yX/) | / |
| 03 | [Episode 3: Running Your First CUDA Program on Windows and Ubuntu](https://www.bilibili.com/video/BV1oc411x7Gt/) | [course01_hello_cuda](./course01_hello_cuda/) |
| 04 | [Episode 4: Hello, CUDA!](https://www.bilibili.com/video/BV1jueweLEQ1/) | [course01_hello_cuda](./course01_hello_cuda/) |
| | | |

## Todo
- [ ] ...
## Acknowledgements
Thanks to the following excellent public learning resources.
- [codingonion/awesome-cuda-and-hpc](https://github.com/codingonion/awesome-cuda-and-hpc) : A collection of some awesome public CUDA, cuBLAS, TensorRT and High Performance Computing (HPC) projects.
- [NVIDIA CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/) : CUDA Toolkit Documentation.
- [NVIDIA CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) : CUDA C++ Programming Guide.
- [NVIDIA CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html) : CUDA C++ Best Practices Guide.
- [NVIDIA/cuda-samples](https://github.com/NVIDIA/cuda-samples) : Samples for CUDA Developers which demonstrates features in CUDA Toolkit.
- [NVIDIA/CUDALibrarySamples](https://github.com/NVIDIA/CUDALibrarySamples) : CUDA Library Samples.
- [NVIDIA-developer-blog/code-samples](https://github.com/NVIDIA-developer-blog/code-samples) : Source code examples from the [Parallel Forall Blog](http://developer.nvidia.com/parallel-forall).
- [HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese](https://github.com/HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese) : A Chinese translation of the CUDA C Programming Guide.
- [cuda-mode/lectures](https://github.com/cuda-mode/lectures) : Material for cuda-mode lectures.
- [cuda-mode/resource-stream](https://github.com/cuda-mode/resource-stream) : CUDA related news and material links.
- [brucefan1983/CUDA-Programming](https://github.com/brucefan1983/CUDA-Programming) : Sample codes for my CUDA programming book.
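- [YouQixiaowu/CUDA-Programming-with-Python](https://github.com/YouQixiaowu/CUDA-Programming-with-Python) : Python versions of the sample code from the book CUDA Programming, written with the pycuda module.
- [QINZHAOYU/CudaSteps](https://github.com/QINZHAOYU/CudaSteps) : A CUDA learning path based on the book 《CUDA编程:基础与实践》 (CUDA Programming: Basics and Practice) by 樊哲勇 (Zheyong Fan).
- [sangyc10/CUDA-code](https://github.com/sangyc10/CUDA-code) : Companion code for the bilibili video series "Introduction to CUDA Programming Basics (continuously updated)".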
- [RussWong/CUDATutorial](https://github.com/RussWong/CUDATutorial) : A CUDA tutorial for learning CUDA programming from scratch.
- [DefTruth/CUDA-Learn-Notes](https://github.com/DefTruth/CUDA-Learn-Notes) : 🎉CUDA/C++ notes, hand-written CUDA kernels for large models, and technical blog posts, updated occasionally: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
- [BBuf/how-to-optim-algorithm-in-cuda](https://github.com/BBuf/how-to-optim-algorithm-in-cuda) : How to optimize some algorithms in CUDA.
- [PaddleJitLab/CUDATutorial](https://github.com/PaddleJitLab/CUDATutorial) : A self-learning tutorial for CUDA high-performance programming, starting from scratch.
- [leimao/CUDA-GEMM-Optimization](https://github.com/leimao/CUDA-GEMM-Optimization) : [CUDA Matrix Multiplication Optimization](https://leimao.github.io/article/CUDA-Matrix-Multiplication-Optimization/). This repository contains the CUDA kernels for general matrix-matrix multiplication (GEMM) and the corresponding performance analysis.
- [Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU) : This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
- [Bruce-Lee-LY/matrix_multiply](https://github.com/Bruce-Lee-LY/matrix_multiply) : Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.
- [Bruce-Lee-LY/cuda_hgemm](https://github.com/Bruce-Lee-LY/cuda_hgemm) : Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
- [Bruce-Lee-LY/cuda_hgemv](https://github.com/Bruce-Lee-LY/cuda_hgemv) : Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.
- [enp1s0/ozIMMU](https://github.com/enp1s0/ozIMMU) : FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme. [arxiv.org/abs/2306.11975](https://arxiv.org/abs/2306.11975)
- [Cjkkkk/CUDA_gemm](https://github.com/Cjkkkk/CUDA_gemm) : A simple high performance CUDA GEMM implementation.
- [AyakaGEMM/Hands-on-GEMM](https://github.com/AyakaGEMM/Hands-on-GEMM) : A GEMM tutorial.
- [AyakaGEMM/Hands-on-MLIR](https://github.com/AyakaGEMM/Hands-on-MLIR) : Hands-on-MLIR.
- [zpzim/MSplitGEMM](https://github.com/zpzim/MSplitGEMM) : Large matrix multiplication in CUDA.
- [jundaf2/CUDA-INT8-GEMM](https://github.com/jundaf2/CUDA-INT8-GEMM) : CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API.
- [chanzhennan/cuda_gemm_benchmark](https://github.com/chanzhennan/cuda_gemm_benchmark) : Based on gtest/benchmark; refers to [https://github.com/Liu-xiandong/How_to_optimize_in_GPU](https://github.com/Liu-xiandong/How_to_optimize_in_GPU).
- [YuxueYang1204/CudaDemo](https://github.com/YuxueYang1204/CudaDemo) : Implement custom operators in PyTorch with cuda/c++.
- [CoffeeBeforeArch/cuda_programming](https://github.com/CoffeeBeforeArch/cuda_programming) : Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch.
- [rbaygildin/learn-gpgpu](https://github.com/rbaygildin/learn-gpgpu) : Algorithms implemented in CUDA + resources about GPGPU.
- [godweiyang/NN-CUDA-Example](https://github.com/godweiyang/NN-CUDA-Example) : Several simple examples for popular neural network toolkits calling custom CUDA operators.
- [yhwang-hub/Matrix_Multiplication_Performance_Optimization](https://github.com/yhwang-hub/Matrix_Multiplication_Performance_Optimization) : Matrix Multiplication Performance Optimization.
- [yao-jiashu/KernelCodeGen](https://github.com/yao-jiashu/KernelCodeGen) : GEMM/Conv2d CUDA/HIP kernel code generation using MLIR.
- [caiwanxianhust/ClusteringByCUDA](https://github.com/caiwanxianhust/ClusteringByCUDA) : A series of clustering algorithms implemented in CUDA C++.
- [ulrichstern/cuda-convnet](https://github.com/ulrichstern/cuda-convnet) : Alex Krizhevsky's original code from Google Code. See the article from the WeChat official account 「人工智能大讲堂」: "[Found: AlexNet's original source code, written from scratch in CUDA/C++ without any framework](https://mp.weixin.qq.com/s/plxXG8y5QlxSionyjyPXqw)".
- [PacktPublishing/Learn-CUDA-Programming](https://github.com/PacktPublishing/Learn-CUDA-Programming) : Learn CUDA Programming, published by Packt.
- [PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Programming-with-Python-and-CUDA) : Hands-On GPU Programming with Python and CUDA, published by Packt.
- [PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA](https://github.com/PacktPublishing/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA) : Hands-On GPU Accelerated Computer Vision with OpenCV and CUDA, published by Packt.
- [codingonion/cuda-beginner-course-cpp-version](https://github.com/codingonion/cuda-beginner-course-cpp-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (C++ Edition)".
- [codingonion/cuda-beginner-course-python-version](https://github.com/codingonion/cuda-beginner-course-python-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Python Edition)".
- [codingonion/cuda-beginner-course-rust-version](https://github.com/codingonion/cuda-beginner-course-rust-version) : Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Rust Edition)".