An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/xavierjiezou/gpu-compute-capability

An application for querying the compute capability of each GPU released by NVIDIA.

cuda gpu nvidia

Last synced: 28 Apr 2026

https://github.com/gunrock/template

Template repository for essentials applications to get you started asap!

cpp cuda essentials gpu graph-algorithms graph-analytics gunrock

Last synced: 15 May 2026

https://github.com/blazekill/hello-cuda

Cpp + Vcpkg + CUDA + VsCode starter project.

cpp cuda vcpkg vscode

Last synced: 18 May 2026

https://github.com/mhaseeb123/gcb

GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.

cpp cpp23 cuda mpi partitioned-communication st-mpi

Last synced: 17 May 2026

https://github.com/memergamer/cuda-fluid-simulation-with-interactive-visualization

A real-time fluid dynamics simulation implemented in Python using CUDA for GPU acceleration, featuring interactive ASCII visualization and automated movement patterns.

colab-notebook cuda liquid-simulations navier-stokes

Last synced: 18 May 2026

https://github.com/thomasonzhou/minitorch

Rebuilding PyTorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 02 Feb 2026

https://github.com/ssoehdata/cuda_fortran_sci_eng

Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition

cuda cuda-fortran fortran hpc nvfortran

Last synced: 21 Aug 2025

https://github.com/leocelente/basic_cuda

My CUDA source files while learning

cpp cuda gpgpu

Last synced: 29 Apr 2026

https://github.com/asadiahmad/gesture-detection

Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.

computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time

Last synced: 29 Apr 2026

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 29 Apr 2026

https://github.com/nofaralfasi/parallel-sequence-alignment

A parallelized version of a multiple DNA sequence alignment algorithm using MPI, OpenMP, and CUDA

cuda mpi openmp sequence-alignment

Last synced: 29 Apr 2026

https://github.com/bl33h/pythagoreantheorem

A program that applies the Pythagorean theorem to a large number of elements using GPU parallel processing.

arrays cuda kernel parallel-programming pythagoras pythagorean-theorem

Last synced: 19 May 2026

https://github.com/nickolasrm/gpuvscpumatrixmultiplication

CPU- and GPU-optimized matrix multiplication (AVX, transposition, CUDA, and others)

avx comparison cuda hpc matrix multiplication

Last synced: 06 Sep 2025

https://github.com/brosnanyuen/raybnn_graph

Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust

Last synced: 06 Feb 2026

https://github.com/tommaso-dognini/polimi_gpu101_courseproject

Polimi Passion In Action GPU101 course project. CUDA implementation of the BFS algorithm.

cpp cuda cuda-programming parallel-computing

Last synced: 10 Apr 2026

https://github.com/sarah627/horus_eye_fcih_graduation_project

An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.

computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow

Last synced: 14 Apr 2026

https://github.com/duskvirkus/ofxarrayfire

An openFrameworks addon with pre-compiled binaries of ArrayFire.

arrayfire cuda ofxaddon openframeworks openframeworks-addon

Last synced: 09 May 2026

https://github.com/nikolaydubina/basic-openai-pytorch-server

Minimal HTTP inference server in the OpenAI API format, with PyTorch and CUDA

cuda docker llm openai pytorch server

Last synced: 12 Apr 2026

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for MNIST dataset

cnn cpp cuda

Last synced: 18 May 2026

https://github.com/aliyoussef97/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 30 Mar 2025

https://github.com/ismailtekin05/caloriedetectingai

🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.

ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8

Last synced: 29 Apr 2026

https://github.com/fynv/cudainline

A CUDA interface for Python. A distillation of the engine part of ThrustRTC.

cuda gpu nvrtc pyhton

Last synced: 18 May 2026

https://github.com/rhysdg/whisper-onnx-python

A low-footprint, GPU-accelerated speech-to-text Python package for the JetPack 5 era, bolstered by an optimized graph

ai chatbot cuda machine-learning onnxruntime speech-to-text whisper

Last synced: 16 Feb 2026

https://github.com/kartavyaantani/cuda_image_processing

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu

Last synced: 30 Apr 2026

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 29 Apr 2026

https://github.com/SanaeProject/Matrix-for-Cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 15 May 2025

https://github.com/xihuai18/image-processing-in-cuda

Implementation of Image Processing Method

cuda imageprocessing

Last synced: 04 Oct 2025

https://github.com/han-minhee/sgemm_hip

SGEMM implementations in HIP for NVIDIA / AMD GPUs

cuda gpgpu gpu hip rocm

Last synced: 27 Apr 2026

https://github.com/gvvsnrnaveen/cuda

This repository contains various programs that can be written using the CUDA Toolkit.

c cpp cuda nvcc nvidia-cuda nvidia-gpu

Last synced: 17 Jan 2026

https://github.com/fblupi/grado_informatica-ppr

Lab assignments for the Parallel Programming (Programación Paralela) course at UGR

cuda mpi openmp parallel-computing

Last synced: 22 Apr 2026

https://github.com/subatomicplanets/simplebitcoinminer

A simple Bitcoin C++ and CUDA solo miner

bitcoin cpp cryptocurrency cuda miner

Last synced: 19 Apr 2026

https://github.com/makischristou/mandelbrot

Mandelbrot set visualizer using CUDA.

cpp cuda gpu mandelbrot nvidia renderer rust

Last synced: 09 Apr 2026

https://github.com/xlisp/learn-vllm

Learning vLLM

cuda nvidia pytorch vllm

Last synced: 10 May 2026

https://github.com/exprays/atlas

Atlas is a specialized convolutional neural network designed for satellite image change detection

alembic celery cnn-for-visual-recognition cuda geospatial-visualization python pytorch tensors

Last synced: 28 Feb 2026

https://github.com/chintak/theano-lasagne-docker

Dockerfile for Lasagne with CUDA support. See the ``cpu`` and ``gpu`` branches for the relevant Dockerfiles.

caffe cuda docker dockerfile install-script lasagne machine-learning machine-learning-library theano

Last synced: 10 Apr 2025

https://github.com/qervas/cn_chess_ai

Chinese chess (Xiangqi) AI

ai cpp cuda dqn qt6

Last synced: 13 Feb 2026

https://github.com/arakiss/hecate-os

Linux distro with automatic hardware detection and per-system optimization. Ubuntu 24.04 base. Alpha.

ai cuda docker gpu hardware-optimization kernel-tuning linux linux-distribution machine-learning nvidia operating-system performance sysctl ubuntu workstation zram

Last synced: 16 Feb 2026

https://github.com/willigarneau/sobel-filter-cuda

🖼️ Assignment 1 in Intelligent Industrial System at Cégep Lévis-Lauzon. Learning CUDA and OpenCV by creating a Sobel filter. 💻

cplusplus cuda filter opencv sobel

Last synced: 16 Apr 2026

https://github.com/m15kh/cuda_programming

CUDA programming enables parallel computing on NVIDIA GPUs for high-performance tasks like deep learning and scientific computing

cuda cuda-programming gpu nvidia parallel-computing practice-programming

Last synced: 03 Apr 2025

https://github.com/andrewboessen/bitonic-merge-sort

Bitonic Merge Sort algorithm optimized for GPU execution

bitonic-merge-sort cuda sorting-network

Last synced: 16 May 2026

https://github.com/amirbroker/cudadtw

Use CUDA with numba for Dynamic Time Warping

cuda dtw dynamic-time-warping gpu numba

Last synced: 16 Apr 2026

https://github.com/kchristin22/ising_model

Implementation of a cellular automaton on GPU using different features of CUDA

cellular-automaton cuda gpu-programming hpc ising-model parallel-computing

Last synced: 15 Mar 2025

https://github.com/romaingrx/ml-nix-flake

A simple Nix flake to start an ML environment with uv and CUDA out of the box

cuda ml nix nix-flake uv

Last synced: 30 Apr 2026

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 May 2026

https://github.com/naidezhujimo/cuda-rewrite-fast-matrix-multiplication

This repository contains an optimized implementation of matrix multiplication using CUDA. The goal of this project is to provide a high-performance solution for matrix multiplication operations on NVIDIA GPUs.

cuda

Last synced: 26 Mar 2025

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and for building my AI engine, Maiden Engine

ai cuda rust

Last synced: 20 Mar 2025

https://github.com/mayukhdeb/patrick

Tiny neural net library written from scratch with CuPy. ⚠️ Under construction ⚠️

cuda deep-learning gpu-computing machine-learning neural-network regression

Last synced: 08 May 2026

https://github.com/stanczakdominik/cuda_poisson

A 2D Poisson solver using CUDA

cuda electromagnetism pde

Last synced: 29 Jun 2025

https://github.com/steleman/openai-triton

Fork of OpenAI's Triton compiler v3.4.0 using LLVM 21.1.0 / 21.1.1 on Fedora 41+

cuda fedora linux llvm mlir mlir-dialect openai rocm triton

Last synced: 08 Apr 2026

https://github.com/gabrielmaialva33/enton

Autonomous AI Robot Assistant — Vision, Voice, and Soul

ai autonomous-agent computer-vision cuda llm python pytorch robot stt tts whisper yolo

Last synced: 01 Apr 2026

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 02 Mar 2025

https://github.com/iag-geo/image-classification

Image classification scripts using YOLOv5 with aerial imagery

cuda image-classification python pytorch swimming-pools yolov5

Last synced: 22 Feb 2026

https://github.com/ezroot/gacc

GIACC - Generate Images, Art, Code and Conversations

ai codegen cuda huggingface image imagegeneration python rust stablediffusion

Last synced: 06 Apr 2026

https://github.com/raumberg/hypervision

Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions

ai aim cuda cython neural-networks python tensorrt yolo

Last synced: 20 May 2026

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods of exploiting cache usage, the multicore CPU, and the GPU architecture in the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 12 Apr 2026

https://github.com/greg-tarr/fastsimplex

CUDA/MPS accelerated 2D & 3D simplex noise generation.

cuda mps noise-generator python simplex-noise

Last synced: 20 Apr 2026

https://github.com/hartorn/docker-python

Repository for building a Python image based on Ubuntu and CUDA

cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804

Last synced: 05 May 2026

https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-

In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.

c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university

Last synced: 19 May 2026

https://github.com/pratikvn/nla4hpc-exercises-framework

The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.

cuda ginkgo homeworks hpc-course teaching

Last synced: 19 May 2026

https://github.com/zhangjun/my_notes

Daily stuff

cuda gpu

Last synced: 17 Apr 2026

https://github.com/yooodleee/hello-cuda

👽Nice to meet you, CUDA!👽

c cc cuda gpgpu multiprocessing

Last synced: 09 Apr 2026

https://github.com/manishklach/thermal-observatory

A generic thermal observability framework for CPU, GPU, board, and platform telemetry across vendor APIs, kernel interfaces, and runtime correlation layers.

amd arm64 cuda linux nvidia nvml observability rocm telemetry thermal-framework thermal-monitoring x86-64

Last synced: 09 Jun 2026

https://github.com/programmer-rd-ai/object-detection-framework

A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.

coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet

Last synced: 24 Sep 2025

https://github.com/pintamonas4575/tfg-diffusion-model-customdataset

A diffusion model built in PyTorch for unconditional image generation with a custom dataset.

attention-mechanism cnn cosine-scheduler cuda custom-dataset ddim deep-learning diffusion-models gpu image-generation pytorch

Last synced: 17 Apr 2026

https://github.com/sbstndb/grayscott_k

A simple 3D Gray-Scott simulation using Kokkos, with a CUDA or OpenMP backend

cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation

Last synced: 16 May 2026

https://github.com/xza85hrf/ml-framework_checker

ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.

compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow

Last synced: 28 Apr 2026

https://github.com/quantum-integrated-technologies/deepforge

DeepForge: a framework for working with machine learning.

ai artificial-intelligence cuda library machine-learning ml neural-network

Last synced: 31 Jul 2025

https://github.com/dhruvsrikanth/fastconv

Distributed and serial implementations of the 2D convolution operation in C++ and CUDA.

convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming

Last synced: 04 May 2026

https://github.com/bl33h/productoftwovectors

This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 16 May 2026

https://github.com/sakurabtc888/btc-eth-evm-ltc-trx-collision

A CPU+GPU collision tool for private and public keys on the BTC, ETH (EVM), LTC, and TRX chains

btc cuda eth evm ltc trx

Last synced: 04 Jul 2025

https://github.com/straightchlorine/quantum-pipeline

A Python module for executing and monitoring quantum algorithms across local simulators and IBM Quantum platforms. Seamlessly handles data collection, organization, and streaming to Apache Kafka.

apache-kafka apache-spark aws-s3 cuda docker gpu-acceleration ibm-cloud ibm-quantum minio qiskit qiskit-aer qiskit-nature quantum-computing visualizations vqe

Last synced: 08 Oct 2025

https://github.com/bhattbhavesh91/rapids-cudf-cuml-example

Running the KNN algorithm much faster on a GPU, for free, using RAPIDS packages like cuML and cuDF

cuda cuml deep-learning nvidia-gpu rapids rapidsai

Last synced: 17 Apr 2026

https://github.com/eric900115/parallelprogramming

The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.

cuda mpi openmp ucx

Last synced: 26 May 2026

https://github.com/sd7campeon/yelp-sentiment-analysis-with-python-bs4-and-llm

A scalable pipeline for automated extraction, preprocessing, and sentiment analysis of Yelp reviews. Uses advanced HTTP requests, HTML parsing, and text normalization (tokenization, stopword removal, lemmatization) to enable precise polarity and subjectivity analysis for consumer insights and business analytics.

beautifulsoup beautifulsoup4 business-analytics cuda data-analysis nlp-machine-learning nltk opinion-mining pandas python python3 requests-library-python sentiment-analysis text-preprocessing textblob torch web-scraping yelp-reviews

Last synced: 06 May 2026

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of STL file renderer using ray tracing method

cuda

Last synced: 25 Feb 2026

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 01 May 2026

https://github.com/maawad/ptx_bcht

Bucketed Cuckoo hash set written in PTX and JIT-compiled.

cuckoo cuda gpu hash hashset ptx

Last synced: 01 May 2026

https://github.com/ehsanmok/cs-521

UBC CS 521: Parallel Computing and Architectures

cuda erlang parallel-algorithm parallel-computing

Last synced: 16 May 2026

https://github.com/hdelan/msc-hpc-final-project

In this project I implement a CUDA Lanczos method to approximate the matrix exponential. The matrix exponential is an important centrality measure for large, sparse graphs.

cuda graph-algorithms krylov-methods

Last synced: 12 Apr 2025