Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with cuda-programming

A curated list of projects in awesome lists tagged with cuda-programming.

https://github.com/DefTruth/CUDA-Learn-Notes

🎉 CUDA/C++ notes / hand-written CUDA kernels for large models / technical blog, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

block-reduce cuda cuda-kernels cuda-programming elementwise flash-attention flash-attention-2 gemm gemv layernorm rmsnorm softmax warp-reduce

Last synced: 31 Jul 2024

https://github.com/laugh12321/TensorRT-YOLO

🚀 TensorRT-YOLO: Supports YOLOv3, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, and PP-YOLOE using TensorRT acceleration with EfficientNMS, CUDA Kernels and CUDA Graphs!

cuda cuda-graph cuda-kernels cuda-programming detection onnx ppyoloe tensorrt yolov10 yolov3 yolov5 yolov6 yolov7 yolov8 yolov9

Last synced: 31 Jul 2024

https://github.com/nosferalatu/SimpleGPUHashTable

A simple GPU hash table implemented in CUDA using lock-free techniques.

cuda cuda-programming data-structures gpu gpu-cuda-programs

Last synced: 02 Aug 2024

https://github.com/HMUNACHI/cuda-repo

From zero to hero CUDA for accelerating maths and machine learning on GPU.

cuda cuda-kernels cuda-programming machine-learning maths

Last synced: 02 Aug 2024

https://github.com/MuGdxy/muda

μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updating.

cuda cuda-cpp cuda-programming

Last synced: 04 Aug 2024

https://github.com/PaddleJitLab/CUDATutorial

A self-learning tutorial for CUDA high-performance programming.

cuda-programming deep-learning

Last synced: 04 Aug 2024

https://github.com/LinhanDai/yolov9-tensorrt

YOLOv9 TensorRT deployment acceleration, providing two implementations: C++ and Python 🔥🔥🔥

cpp cuda-programming python tensorrt yolov9

Last synced: 31 Jul 2024

https://github.com/Lin-Mao/DrGPUM

A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.

cuda-programming gpu-memory gpu-memory-profiler gpu-profiler memory-management

Last synced: 09 Aug 2024

https://github.com/codingonion/cuda-beginner-course-cpp-version

Companion code for the bilibili video series "Introduction to CUDA 12.1 Parallel Programming (C++ Edition)".

cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 04 Aug 2024

https://github.com/littlebearsama/xxCu3Dlibrary

CUDA-accelerated 3D point cloud algorithm library, continuously updated (includes cudaicp, GLFW point cloud visualization, etc.).

cuda-programming glfw3 pointcloud

Last synced: 31 Jul 2024

https://github.com/codingonion/cuda-beginner-course-python-version

Companion code for the bilibili video series "Introduction to CUDA 12.1 Parallel Programming (Python Edition)".

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 04 Aug 2024

https://github.com/codingonion/cuda-beginner-course-rust-version

Companion code for the bilibili video series "Introduction to CUDA 12.1 Parallel Programming (Rust Edition)".

candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust

Last synced: 04 Aug 2024

https://github.com/pastekaztekastor/crowd-simulation

The project is a crowd simulation on a grid, with versions parallelized on the graphics card. The goal is to model the movement of individuals in an environment using parameters such as the grid dimensions and the number of individuals. The result of each frame is exported to a bin file for analysis.

c cmake cpp crowdsimulation cuda-programming graphicscard grid-layout ipynb make nvidia-gpu parallelization

Last synced: 27 Sep 2024

https://github.com/GCaptainNemo/Cuda-Image-Processing

Using CUDA GPU Programming to speed up image processing.

cuda-programming image-processing

Last synced: 31 Jul 2024

https://github.com/pavulurig/matrix-mul-pytorch-cuda-cpu-analysis

Compares the performance of matrix multiplication on CPU and GPU using PyTorch CUDA programming.

cuda-programming matrix-multiplication python3 pytorch

Last synced: 01 Oct 2024

https://github.com/sahil-rajwar-2004/vector-cuda

Vector calculation with GPU acceleration using CUDA.

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 29 Sep 2024

https://github.com/pzaino/cpp-hpc

A collection of stuff for HPC in C++

coding cpp cpp17 cuda-programming hpc library opencl openmp

Last synced: 28 Sep 2024

https://github.com/bardiparsi/threadpoolmanager

ThreadPoolManager is a C++ project that implements an efficient multi-threading system using a thread pool for generic functions of the same type and different tasks. It includes task management, synchronization mechanisms, and thread-safe logging to demonstrate concurrent task execution.

cpp cpp17 cpp20 cuda cuda-programming memory-management multiprocessing multithreading parallel-computing parallel-processing parallel-programming thread thread-pool thread-safety threadpool threads threadsafe

Last synced: 28 Sep 2024