Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with cuda-kernels

A curated list of projects in awesome lists tagged with cuda-kernels.

https://github.com/nvidia/cuda-samples

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

cuda cuda-driver-api cuda-kernels cuda-opengl

Last synced: 17 Dec 2024

https://github.com/internlm/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

codellama cuda-kernels deepspeed fastertransformer internlm llama llama2 llama3 llm llm-inference turbomind

Last synced: 22 Dec 2024

https://github.com/laugh12321/TensorRT-YOLO

🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast inference speeds.

cuda cuda-graph cuda-kernels cuda-programming detection onnx ppyoloe tensorrt yolov10 yolov3 yolov5 yolov6 yolov7 yolov8 yolov9

Last synced: 27 Oct 2024

https://github.com/nvidia/nvbench

CUDA Kernel Benchmarking Library

benchmark cuda cuda-kernels gpu kernel-benchmark nvidia performance

Last synced: 21 Dec 2024

https://github.com/harrism/hemi

Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.

c-plus-plus cuda cuda-device cuda-kernels gpu hemi

Last synced: 16 Dec 2024

https://github.com/hmunachi/cuda-repo

From zero to hero CUDA for accelerating maths and machine learning on GPU.

cuda cuda-kernels cuda-programming machine-learning maths

Last synced: 21 Dec 2024

https://github.com/deepakkumar1984/Amplifier.NET

Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your function in .NET and Amplifier will take care of running it on your favorite hardware.

compiler cuda-kernels gpgpu gpgpu-computing gpgpu-sim opencl opencl-kernels simd

Last synced: 26 Oct 2024

https://github.com/patwie/cuda-design-patterns

Some CUDA design patterns and a bit of template magic for CUDA

bazel cpp11 cuda cuda-development cuda-device cuda-kernels cuda-utils gpu template-metaprogramming

Last synced: 01 Nov 2024

https://github.com/yalue/cuda_scheduling_examiner_mirror

A tool for examining GPU scheduling behavior.

benchmark cuda cuda-kernels gpu gpu-scheduling mandelbrot

Last synced: 16 Dec 2024

https://github.com/emptysoal/cuda-image-preprocess

Speed up image preprocessing with CUDA when handling images or running TensorRT inference

cnn cuda cuda-demo cuda-kernels cuda-programming deep-learning image-processing tensorrt

Last synced: 06 Dec 2024

https://github.com/stellar-group/octotiger

Astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive Octrees

astrophysics cuda cuda-kernels hpx kokkos simd stellar-mergers sycl

Last synced: 12 Nov 2024

https://github.com/ashvardanian/scaling-democracy

GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory

cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting

Last synced: 07 Nov 2024

https://github.com/lawmurray/gpu-gemm

CUDA kernel for matrix-matrix multiplication on Nvidia GPUs, using a Hilbert curve to improve L2 cache utilization.

cplusplus cuda cuda-kernels cuda-programming gpu gpu-computing gpu-programming matrix-multiplication numerical-methods scientific-computing

Last synced: 01 Nov 2024

https://github.com/l4nos/php-cuda

An extension for PHP that lets it access GPU operations on CUDA-capable (NVIDIA) graphics cards

cuda cuda-kernels cuda-php php php-dll php-ext php-extension

Last synced: 18 Dec 2024

https://github.com/nrmancuso/big-bang

CUDA and OpenMP N-body simulation based on data from the Milky Way and Andromeda galaxies

c cuda-kernels cuda-programming nbody-simulation openmp-parallelization parallel-computing space

Last synced: 03 Dec 2024

https://github.com/sergeipapina/color2graycuda

Color-to-grayscale image conversion implemented as an NVIDIA CUDA kernel, compiled and linked with Make or CMake

cmake cuda cuda-kernels cuda-programming link makefile nvidia

Last synced: 20 Dec 2024

https://github.com/chrisdalvit/gpu-matrix-transpose

Implementation and benchmarking of different matrix transpose approaches with CUDA

c cpp cuda cuda-kernels cuda-programming gpu-acceleration gpu-computing gpu-programming matrix-transpose nvidia-gpu

Last synced: 20 Dec 2024

https://github.com/hrolive/fundamentals-of-accelerated-computing-with-cuda-c-cpp

Accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.

cpp cuda cuda-kernels cuda-programming nsight nvidia profilling

Last synced: 09 Nov 2024

https://github.com/sahil-rajwar-2004/vector-cuda

Vector calculations with GPU acceleration using CUDA

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 19 Nov 2024

https://github.com/0xhilsa/vector-cuda

Vector calculations with GPU acceleration using CUDA

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 15 Dec 2024

https://github.com/nvaranki/cmmx

CUDA matrix multiplication (official guide, modified)

cuda cuda-kernels

Last synced: 10 Dec 2024

https://github.com/alexkranias/triton_vs_cuda

Building Triton and CUDA kernels side-by-side to create a cuBLAS-performant GEMM kernel.

cuda cuda-kernels gpu gpu-programming parallel-programming python triton

Last synced: 10 Dec 2024

https://github.com/tomtolleson/cuda-kernel-benchmarking-tool

A benchmarking tool in C++ that creates CUDA kernels and compares overall system performance between the CPU and GPU

cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu

Last synced: 11 Dec 2024