Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/roflmaostc/radonka.jl

A simple yet sufficiently fast (attenuated) Radon and backproject implementation using KernelAbstractions.jl. Runs on CPU, CUDA, ...

automatic-differentiation computed-tomography ct cuda gpu julia julia-language optimization radon radon-transform tomography x-ray

Last synced: 12 Oct 2024

https://github.com/tcoppex/cudaraster-linux

Linux port of cudaraster, Nvidia's GPU rasterizer.

cuda gpu rasterizer

Last synced: 16 Nov 2024

https://github.com/wrathematics/proginfo

A small utility for getting some info post-hoc about a program's run.

cuda gpu nvidia profiler profiling

Last synced: 22 Oct 2024

https://github.com/ktaletsk/gpu_dsm

🔗Accessible quantitative polymer rheology predictions with slip-links on GPU

c-plus-plus cuda gpu polymer rheology

Last synced: 31 Dec 2024

https://github.com/joaomlneto/cpds-heat

Heat Equation using different solvers (Jacobi, Red-Black, Gaussian) in C using different paradigms (sequential, OpenMP, MPI, CUDA) - Assignments for the Concurrent, Parallel and Distributed Systems course @ UPC 2013

cuda cuda-support gauss-seidel gaussian heat-equation jacobi mpi mpi-applications openmp openmp-applications openmp-parallelization openmp-support openmpi paradigms performance red-black solvers

Last synced: 09 Nov 2024

https://github.com/maxilevi/raytracer

C++ raytracer that supports custom models. Supports running the calculations on the CPU using C++11 threads or on the GPU via CUDA.

bvh cuda graphics-programming intersection raytracer

Last synced: 11 Nov 2024

https://github.com/zephirfxec/hnanosolver

Houdini GPU Fluid Solver powered by NanoVDB

cpp cuda fluid-dynamics houdini nanovdb openvdb

Last synced: 23 Oct 2024

https://github.com/NAGAGroup/Scalix

Scalix is a data parallel compute library that automatically scales to the available compute resources.

cuda hpc scientific-computing

Last synced: 02 Nov 2024

https://github.com/aniketsingh03/processing-history-of-images

💡 Detecting processing history of images by using Deep Learning

cuda deep-learning image-forensics matlab python3 pytorch

Last synced: 19 Dec 2024

https://github.com/Christophe-Foyer/darknet_wsl_cuda_install_scripts

Install scripts for Darknet and OpenCV with CUDA support on WSL

cuda darknet opencv wsl wsl2

Last synced: 24 Oct 2024

https://github.com/enp1s0/shgemm

Fast multiplication of single-precision and half-precision matrices on Tensor Cores

cuda

Last synced: 26 Dec 2024

https://github.com/ghost---shadow/near-duplicate-image-detector

CUDA implementation of some perceptual hashing algorithms

cuda image-hashing thrust

Last synced: 11 Oct 2024

https://github.com/tillahoffmann/universal_tensorflow_image

Develop tensorflow models with or without a GPU accelerator using the same Docker image. 🥳

cuda nvidia-docker tensorflow

Last synced: 11 Oct 2024

https://github.com/adamtiger/tinygpulang

Tutorial on building a gpu compiler backend in LLVM

cuda llvm

Last synced: 14 Oct 2024

https://github.com/yhmtsai/ci_windows_cuda

This repo creates the Dockerfiles for using CUDA in Windows Docker containers and provides the GitLab/GitHub Windows shared VM runner config.

continuous-integration cuda docker github-actions gitlab windows

Last synced: 27 Nov 2024

https://github.com/jacobtomlinson/advent-of-gpu-code-2020

Solutions for Advent of Code 2020 written for the GPU in Python

advent-of-code cuda gpu jupyter-notebooks numba python

Last synced: 29 Oct 2024

https://github.com/yalue/cudabrot

A CUDA renderer for the Buddhabrot fractal

amd buddhabrot buddhabrot-fractal cuda gpu hip mandelbrot mandelbrot-fractal rocm

Last synced: 23 Oct 2024

https://github.com/neka-nat/cuimage

Rust implementation of image processing library with CUDA

computer-vision cuda rust

Last synced: 14 Oct 2024

https://github.com/pkestene/cuda-proj-tmpl

A minimal CMake-based project skeleton for developing a CUDA application

cea cmake cuda gpu gpu-computing parallel-computing parallel-programming template

Last synced: 18 Dec 2024

https://github.com/elftausend/gradients

Deep Learning library written in Rust (OpenCL, CUDA & CPU)

cpu cuda deep-learning gpu gpu-acceleration machine-learning mlp neural-networks opencl rust

Last synced: 08 Dec 2024

https://github.com/lanl/stcuda

StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing

cuda smalltalk visualworks

Last synced: 09 Dec 2024

https://github.com/bokutotu/zenu

Deep Learning Framework Written in Rust

ai autograd blas cublas cuda cudnn deep-learning deep-neural-networks gpu-computing hpc rust

Last synced: 15 Dec 2024

https://github.com/superlinear-ai/python-gpu

🐳 Python GPU adds a minimal install of CUDA and cuDNN on top of the official python:3.x-slim base image

cuda cudnn docker docker-image python

Last synced: 11 Nov 2024

https://github.com/bhattbhavesh91/cudf-rapids-demo

A simple demo of cuDF which is a RAPIDS GPU-Accelerated Dataframe Library!

arrow cuda cudf demo gpu gpu-dataframe pandas python rapids

Last synced: 16 Nov 2024

https://github.com/p-ranav/vulkan-earth

Vulkan-based 3D Rendering of Earth

3d cuda engine gpu rendering simulation vulkan

Last synced: 13 Nov 2024

https://github.com/dancing-ui/uestc_vhm

Object tracking using YOLOv8, fast-reid, and DeepSORT

cuda deepsort dockerfile fast-reid tensorrt yolov8n

Last synced: 23 Oct 2024

https://github.com/jatinx/pyhip

Python Interface to HIP and hiprtc Library

bindings cuda gpu hip hiprtc python rocm

Last synced: 23 Oct 2024

https://github.com/benediktalkin/kappaprofiler

lightweight simple profiling for python/pytorch

cuda profiler python pytorch

Last synced: 09 Nov 2024

https://github.com/abus-aikorea/aria-coversong

The best Gradio web UI for creating cover songs using MDX-Net and RVC. Easy one-click installation. Fully portable.

cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion

Last synced: 10 Nov 2024

https://github.com/valentingol/deep-learning-installation

This tutorial provides a step-by-step pipeline for installing an effective Python setup optimized for deep learning on Ubuntu LTS. It covers the libraries needed to use the latest versions of TensorFlow and PyTorch efficiently with the GPU, plus a comfortable working environment with a flexible, highly customizable IDE (VSCode) and an environment manager (Virtualenv/VirtualenvWrapper).

cuda deep-learning deep-learning-library deep-learning-tutorial setup tutorial virtualenv virtualenvwrapper vscode

Last synced: 11 Oct 2024

https://github.com/mnicely/computeworks_examples

Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA

blas cublas cuda docker eclipse-plugin nsight nvidia nvidia-docker openacc openmp pgi-compiler

Last synced: 15 Oct 2024

https://github.com/dusanerdeljan/stereo-depth

Bachelor thesis - GPU accelerated single view passive stereo depth estimation pipeline

convolutional-neural-networks cuda depth-estimation pytorch real-time stereo-matching stereo-vision

Last synced: 11 Oct 2024

https://github.com/jundaf2/gpu-tensor-permute

permute sequence data on GPU with high bandwidth

cuda gpu-acceleration sequence-to-sequence

Last synced: 15 Nov 2024

https://github.com/torinos-yt/nnonnx

Using CUDA for Faster Machine Learning Inference on Unity

cuda machine-learning onnxruntime unity

Last synced: 20 Nov 2024

https://github.com/dendenxu/bvh-ray-tracing

CUDA Ray Tracing using BVH. Forked and modified from https://github.com/YuliangXiu/bvh-distance-queries

bvh cuda pytorch ray-tracing ray-triangle-intersection

Last synced: 27 Jan 2025

https://github.com/tigercosmos/simple-vgg16-cu

Simple VGG16 implemented in CUDA

cublas cuda cudnn vgg16

Last synced: 15 Oct 2024

https://github.com/rmiguelkelly/quickcluster

A KMeans implemented in C++ with Python bindings and GPU acceleration

clustering clustering-algorithm cpp cuda gpu kmeans kmeans-clustering metal objective-c python python3 unsupervised-learning

Last synced: 12 Oct 2024

https://github.com/frgfm/torch-cuda-template

Template for CUDA / C++ extension writing with PyTorch

cpp cuda pytorch pytorch-extension

Last synced: 05 Dec 2024

https://github.com/microsoft/hat

TOML-annotated C header file format for packaging binary files, from Microsoft Research

benchmarking cpp cprogramming cuda metadata platform-independent python-library rocm toml

Last synced: 12 Oct 2024

https://github.com/alesiong/template-matching

Simple template matching by GPU (CUDA)

computer-vision cuda template-matching

Last synced: 29 Nov 2024

https://github.com/pliablepixels/simpleyolo

A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

cuda darknet opencv python3 yolov3

Last synced: 11 Oct 2024

https://github.com/cms-patatrack/cluestering

Density-based clustering algorithm developed at CERN

alpaka cern clustering cpp cuda pybind11 python tbb

Last synced: 30 Oct 2024

https://github.com/romnn/microgpusim

Cycle-level, trace-driven, parallel GPU simulator for NVIDIA Pascal.

cuda cycle-level design-space-exploration gpgpu gpu nvbit nvidia performance-engineering rust simulation trace-driven

Last synced: 18 Nov 2024

https://github.com/koesie10/gpjson

GPU-based JSON data processing system accessible via all GraalVM languages

cuda gpu graalvm json jsonpath

Last synced: 24 Oct 2024

https://github.com/chrxh/alien-docs

Documentation for ALIEN

cuda evolution physics-simulation simulation

Last synced: 15 Oct 2024

https://github.com/nolmoonen/jpeggpu

Low-latency CUDA JPEG decoder by parallelizing Huffman decoding

cuda huffman jpeg

Last synced: 23 Oct 2024

https://github.com/marcogarlet/cuda_cubeattack

CUDA implementation of Cube Attack

cryptography cubeattack cuda

Last synced: 11 Oct 2024

https://github.com/neoblizz/hip_template

🖤 Template for starting HIP/C++ project using CMake with Github Action for CI.

cpp cuda cuda-programming gpgpu gpu hip rocm template-project template-repository

Last synced: 30 Oct 2024

https://github.com/radenmuaz/slope-ad

A small automatic differentiation engine, supporting higher-order derivatives

array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp

Last synced: 08 Dec 2024

https://github.com/hrntsm/ghgpucomputingtest

Test using CUDA with Alea GPU in grasshopper.

cuda grasshopper3d

Last synced: 27 Nov 2024

https://github.com/NCAR/micm

A model-independent chemistry module for atmosphere models

atmospheric-chemistry atmospheric-modeling atmospheric-science cuda gpu gpu-acceleration hpc ode-solver

Last synced: 27 Nov 2024

https://github.com/phineas-pta/nvidia-win

NVIDIA’s deep learning stack on Windows: CUDA toolkit + cuDNN + TensorRT

cuda cudnn guide tensorrt tutorial windows

Last synced: 14 Oct 2024

https://github.com/pinto0309/realsense-cuda-opengl-docker

RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.

cuda docker opengl realsense realsense2 ubuntu wsl2

Last synced: 29 Oct 2024

https://github.com/gapi505/sparky-2

This is a Discord bot running on llama.cpp with the Llama 3 model and image generation

ai cuda llama3 llamacpp stable-diffusion torch transformers

Last synced: 24 Jan 2025

https://github.com/pyhf/cuda-images

pyhf Docker images built on Nvidia Container Toolkit enabled base images

cuda jax nvidia nvidia-cuda nvidia-docker pyhf

Last synced: 23 Nov 2024

https://github.com/raymondcm/blockmatching

CPU and CUDA implementation of Full Exhaustive Block Matching Algorithm using Integral Images

block-matching-algorithm cuda integral-image parallel vision

Last synced: 21 Oct 2024

https://github.com/coderonion/cuda-beginner-course-python-version

Companion code for the bilibili video course "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 19 Nov 2024

https://github.com/theochem/cugbasis

High performance CUDA/Python library for computing quantum chemistry density-based descriptors for larger systems using GPUs.

atoms-in-molecules computational-chemistry conceptual-dft cuda electron-density gpu python qtaim quantum quantum-chemistry theoretical-chemistry

Last synced: 17 Nov 2024

https://github.com/rogerallen/smandelbrotr

SDL2 CUDA OpenGL Mandelbrot explorer.

cuda mandelbrot-viewer opengl sdl2

Last synced: 25 Nov 2024

https://github.com/drsnowbird/cuda-pytorch-docker

Nvidia CUDA for GPU + PyTorch (latest) in Docker

cuda deep-learning docker gpu jupyter-notebook nvidia-gpu pytorch ssl-proxy

Last synced: 14 Nov 2024

https://github.com/taeguk/dist-prog-assignment

Sogang Univ. Distributed Programming (CSE5414) Assignments.

assignment cuda distributed mpi-library openmp parallel pthreads sogang

Last synced: 03 Dec 2024

https://github.com/tudasc/cusan

A data race detector for CUDA C and C++ based on ThreadSanitizer

c cpp cuda datarace threadsanitizer

Last synced: 14 Dec 2024

https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem

Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)

c cuda genetic-algorithm tsp tsp-solver

Last synced: 01 Dec 2024

https://github.com/bkraad47/fat_llama

fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.

audio audio-engineering audio-processing audiophile cuda cufft cupy fft flac hi-res hpc mp3 music nvidia ogg parallel-computing physics upscaling wav

Last synced: 23 Oct 2024

https://github.com/belval/raytracing

Using CUDA to implement "Raytracing in one weekend" by Peter Shirley

cuda raytracing raytracing-in-one-weekend

Last synced: 14 Oct 2024

https://github.com/rocm/numba-hip

HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.

ai compiler cuda gpu hip hpc jit ml numba python radeon-instinct-mi-series rocm

Last synced: 28 Dec 2024

https://github.com/acecoooool/cs344-note

CS344-Note-zh

c cuda gpu

Last synced: 09 Nov 2024

https://github.com/alpaka-group/bactria

Broadly Applicable C++ Tracing and Instrumentation API 🐫

cuda hardware-counters instrumentation-api metrics rocm tracing-events

Last synced: 09 Nov 2024

https://github.com/bryanoliveira/cellular-automata

A cellular automata program built with C++, OpenGL, CUDA and OpenMP.

cellular-automata cuda life opengl openmp

Last synced: 03 Jan 2025

https://github.com/lukoshkin/hpc

Skoltech HPC course

cuda curand hpc mpi omp

Last synced: 07 Nov 2024

https://github.com/elsa-lab/base-env

Basis of ELSA computational platform

cuda machine-learning server-utility ubuntu

Last synced: 11 Nov 2024

https://github.com/pkestene/tsp

Traveling salesman problem solved with different programming models

cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl

Last synced: 18 Dec 2024

https://github.com/potato3d/grid

GPU-accelerated uniform grid construction for ray tracing

cuda glsl gpu grid ray-tracing

Last synced: 10 Jan 2025

https://github.com/neoheartbeats/neoheartbeats

An architecture for LLMs' continual-learning and long-term memories

cuda fine-tuning llama-factory llm

Last synced: 15 Nov 2024

https://github.com/jpuigcerver/nnutils

CPU & CUDA implementation of several neural network utils

cuda deep-learning neural-networks openmp pytorch

Last synced: 02 Dec 2024

https://github.com/cascadingradium/cuda-hungarian-clustering

A GPU-Accelerated Clustering Algorithm that uses the Hungarian method

clustering cpp cuda gpu hungarian-algorithm parallel-computing

Last synced: 19 Nov 2024

https://github.com/ashvardanian/scaling-democracy

GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory

cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting

Last synced: 07 Nov 2024

https://github.com/pmeier/tox-ltt

Install PyTorch distributions with light-the-torch

cuda install light-the-torch pip plugin pytorch tox

Last synced: 22 Dec 2024

https://github.com/yuvix25/py2cuda

Convert Python 3 code to CUDA code.

converter cuda gpu gpu-acceleration python python3

Last synced: 05 Jan 2025

https://github.com/willigarneau/astar-pathfinding

🗺📌 Implementation of the A* pathfinding algorithm with OpenCV and Cuda in C++ 💪

a-star algorithm axis-camera cuda detection implementation opencv pathfinding

Last synced: 23 Nov 2024

https://github.com/jackeylea/cuda_linux

CUDA/Qt tutorial on Linux

cpp cuda cudnn qt5

Last synced: 23 Jan 2025

https://github.com/bencardoen/singularity_slurm_cuda

Example on how to get started with Singularity and CUDA on a SLURM cluster

cuda nvidia singularity-container slurm-cluster tensorflow

Last synced: 23 Oct 2024

https://github.com/enp1s0/culip

Library for profiling the execution time of CUDA official library functions

cublas cuda profiling

Last synced: 06 Nov 2024

https://github.com/xiaohaoo/yolo_tensorrt

Deploy the YOLOv8 model for inference using OpenCV and TensorRT in C/C++.

c cuda opencv tensorrt yolov8

Last synced: 24 Dec 2024

https://github.com/aespinosadev/opengl-renderer

OpenGL renderer showcasing all basic functionality to render 3D scenes.

computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game

Last synced: 23 Jan 2025

https://github.com/cascadingradium/air-traffic-distribution

A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management

air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration

Last synced: 19 Nov 2024

https://github.com/enfiskutensykkel/cuda-rdma-bench

NVIDIA GPU direct RDMA using SISCI API

cuda dma gpudirect-rdma pcie rdma sisci

Last synced: 01 Nov 2024

https://github.com/rfsantacruz/mycudasamples

This is a series of CUDA C++ programming samples developed to study CUDA technology and its parallel programming model.

cpp cuda gpgpu

Last synced: 07 Nov 2024

https://github.com/elinliu0/studentbehaviordetection

Shenyang University: student behavior detection code repository (based on YOLOv8 + CV-CUDA + TensorRT)

cuda cv-cuda python tensorrt yolov8

Last synced: 11 Nov 2024