An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/separatrixxx/pgp_labs_7_sem

👓 Laboratory work for the 7 semester of MAI on PGP and PDP

cpp cuda nvidia

Last synced: 15 May 2026

https://github.com/bjornmelin/cuda-core-projects

🎯 Essential CUDA programming patterns and optimizations. Showcasing parallel computing expertise through matrix operations, memory management, and advanced kernel implementations. 💻

cpp cuda cuda-kernels gpu-computing high-performance-computing nvidia optimization parallel-computing

Last synced: 12 Apr 2026

https://github.com/m-torhan/advent-of-code

🎄 Solutions for the Advent of Code

advent-of-code advent-of-code-2024 cuda

Last synced: 07 Apr 2025

https://github.com/kylesayrs/pttp

PyTorch Tensor Profiler with fully-supported memory timelines and events

cuda memory profiling pytorch

Last synced: 07 May 2026

https://github.com/sergiomarquezdev/yt-transcriber

🛠️ CLI tool to transcribe YouTube videos using OpenAI Whisper with CUDA acceleration, generate AI summaries (EN/ES) with Gemini, and create LinkedIn/Twitter content. Supports YouTube, Google Drive, and local files.

ai cli cuda gemini python transcription whisper youtube

Last synced: 15 May 2026

https://github.com/kis-balazs/cuda-research

CUDA Research & Code. Course-style structured. Inspiration from @Infatoshi.

cuda

Last synced: 14 May 2025

https://github.com/bjornmelin/ai-system-design

🎨 Large-scale AI system architectures and implementations. Features distributed training systems, multi-GPU pipelines, and efficient resource management. 🏗️

architecture cuda distributed-systems engineering gpu-computing production scalability system-design

Last synced: 23 Jul 2025

https://github.com/lionpsiuc/cflow

A computational model for heat propagation in a cylindrical radiator using both CPU and GPU parallel processing. The simulation uses finite difference methods to model the directional flow of heat through a cylindrical pipe system with specific boundary conditions and cyclic connections between pipe segments.

c cuda parallel-programming

Last synced: 29 May 2026

https://github.com/derek-palmer/dvr-scan-file-organizer

DVR-Scan-Organizer is a Dockerized extension for DVR-Scan, designed to process multiple video files and organize output in a structured format.

cuda dvr dvr-scan multimedia opencv opencv-python python video video-processing

Last synced: 01 May 2026

https://github.com/elymsyr/auv_ws

An open-source simulation and control workspace for an Autonomous Underwater Vehicle (AUV) built on ROS 2 Humble and Gazebo. It features a high-fidelity dynamics model and an advanced AI-based motion controller (FossenNet) that uses a pre-trained LibTorch model to imitate a NL-MPC for real-time, high-performance manoeuvring.

autonomous-vehicles auv control-systems cpp cuda deep-learning gazebo imitation-learning libtorch mpc python robotics ros2 simulation

Last synced: 15 Apr 2026

https://github.com/quik-fe/node-nvidia-smi

Node wrapper around nvidia-smi.

cuda gpu nodejs nvidia nvidia-smi typescript

Last synced: 19 Feb 2026

https://github.com/maneeshsit/pcie

Modify run:ai and other FOSS projects code for use with PCIe card-based AI accelerators for both inference and training

cuda cxl cxl-mem distro exo k3s k8s kestra llamacpp llm-d mpi4py mpio onnxoptimizer opentelemetry-ebpf-profiler paxos-cluster pcie photonics-computing runai visualize vllm

Last synced: 24 Aug 2025

https://github.com/oaslananka/cv_cuda_cpp_sample

This is a sample project demonstrating how to use OpenCV and CUDA in C++ for detecting people in drone footage with YOLO. The project aims to be simple and understandable for those who want to learn how to use OpenCV and CUDA in C++.

computervision cpp cuda opencv

Last synced: 01 May 2026

https://github.com/0xhilsa/tenop

A lightweight & minimalist tensor computation library with CUDA backend

bash c cuda python3 tensor

Last synced: 13 Apr 2026

https://github.com/timvgl/cuxrft

Performs FFT in xarrays using cuda

cuda cupy fft python xarray

Last synced: 07 Jan 2026

https://github.com/sshoecraft/shepherd

An interactive multi-backend LLM runtime with intelligent cache eviction and persistent retrieval-augmented memory.

anthropic cli cpp cuda gemini grok inference kv-cache llama-cpp llm mcp ollama openai openai-server rag smart-evictions tensorrt tool-calling ulimited-context

Last synced: 10 Apr 2026

https://github.com/camille-004/cusprec

🏁 Sparse signal recovery library written in PyCUDA.

cuda ml python signal-processing sparse-recovery

Last synced: 18 Jan 2026

https://github.com/1ytic/cuda-gpu-zoo

Properties of the CUDA devices

cuda gpu

Last synced: 20 Aug 2025

https://github.com/sid911/neuralnetworkcpp

A small experiment to learn about neural networks and their runtimes in cpp

cpp cuda machine-learning neural-network

Last synced: 20 Aug 2025

https://github.com/pvgupta24/parallel-programming

Basic algorithms for parallel programming in CUDA C++, Java and OpenMP

cuda openmp parallel-programming

Last synced: 19 Aug 2025

https://github.com/dmalexx/cuda_check

How can you check if CUDA is available in Tensorflow

cuda python tensorflow

Last synced: 10 Apr 2026

https://github.com/rmeli/cuda-pg

CUDA C++ Playground

cpp cuda gpu

Last synced: 16 Apr 2026

https://github.com/ojeda-e/fokker-planck

Numerical solution of the Fokker-Planck equation in large times using CUDA/C.

cuda fokker-planck-equations

Last synced: 17 Aug 2025

https://github.com/alessiobugetti/integral-image-processing

Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU

cuda gpu-acceleration integral-image numba parallel-computing pycuda

Last synced: 24 May 2026

https://github.com/i-m-iron-man/abmax

Abmax is an agent-based modelling framework in Jax, focused on dynamic population size

abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python

Last synced: 04 Oct 2025

https://github.com/hrolive/data-analytics-in-the-era-of-large-scale-machine-learning

Slides and other material for the Cyprus NCC training event about "Data analytics in the era of large-scale machine learning".

cuda deep-learning gpu-acceleration gradient-boosting large-language-models machine-learning preprocessing python pytorch

Last synced: 13 Apr 2026

https://github.com/andreeo/parallel-computing-cuda

Programs in terminal applying the parallel programming model with the CUDA arquitecture

c cpp cuda docker lineal-search parallel-computing parallel-reduction rank-sort-algorithm

Last synced: 09 Apr 2026

https://github.com/ivanbuccella/sf2bio

Deep reinforcement learning for de novo drug design: a ReLeaSe method execution on a Docker Environment

cuda deep-learning deep-reinforcement-learning docker docker-compose machine-learning nvidia-cuda nvidia-docker reinforcement-learning release release-method

Last synced: 01 May 2026

https://github.com/nwpu66/cookiekiss-engine

CookieKiss Engine include a render and other small tech related to compute graphic.

compute-graphics cpp cuda opengl vulkan

Last synced: 09 Apr 2026

https://github.com/shineiarakawa/particle-stabilizer

A C++ and CUDA-based program for simulating the motion of particles.

cpp cuda n-body particles

Last synced: 12 May 2026

https://github.com/ibrar-syed/complete_deep-learning-nvidia_gpu-setup-linux

Full setup for a deep learning environment on Ubuntu Linux with CUDA, cuDNN, TensorRT, and TensorFlow GPU. Includes scripts, test code, and environment configuration

ai bash conda cuda cudnn deep-learning environment-setup gcc gpu jupyter linux machine-learning nvidia-cuda nvidia-gpu pytorch setup-script tensorflow tensorrt

Last synced: 09 Apr 2026

https://github.com/matthewfeickert/report-urssi-fellowship-2025

Report on URSSI 2025 Early-Career Fellowship

cuda pixi urssi

Last synced: 17 Jan 2026

https://github.com/equiel-1703/cuhip

Wrapper tool to convert CUDA source code to HIP code and compile it with HIPCC. Useful for learning CUDA programming using AMD devices..

cuda hip

Last synced: 14 May 2026

https://github.com/ray-chew/modified_ch

Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers

cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory

Last synced: 11 May 2026

https://github.com/hr-fahim/transformer-model-optimization

Sample GPT Transformer Model from Scratch.

cuda few-shot-learning transfomers

Last synced: 02 May 2026

https://github.com/danieljvickers/fluid_simulation

An educational example for learning the Navier-Stoke equations. Also included is a C++ and CUDA shared object library, buildable with CMake, for use in your personal projects.

cpp cuda differential-equations navier-stokes numpy physics python simulation

Last synced: 04 May 2026

https://github.com/timdev-r/cv-ground-truth-extraction

(Dump) Helper for ground truth extraction, movement analytics and silhouette visual demonstration

computer-vision cuda ground-truth intel-realsense pandas python

Last synced: 18 Apr 2026

https://github.com/datasagess/fic

NLP Hackaton \w NN + FastAPI + Docker

catboost cuda docker fastapi lstm python pytorch rapidfuzz tensorflow

Last synced: 08 Aug 2025

https://github.com/notargets/gocca

Go bindings for OCCA - Portable parallel programming framework

bindings cfd cgo cuda golang gpu hpc occa opencl parallel-computing

Last synced: 20 Jan 2026

https://github.com/dmitryyurov/bitonic-cuda

An implementation of bitonic search on CUDA

cuda gpu-programming sorting-algorithms

Last synced: 02 Oct 2025

https://github.com/conan-kiln/kiln

An actively maintained fork of ConanCenter with an emphasis on CV, ML and robotics capabilities on edge devices

computer-vision conan cuda machine-learning oneapi packaging robotics rust scientific-computing

Last synced: 02 Oct 2025

https://github.com/brave-tarnished/gpu-accelerated-opc

Optical Proximity Correction (OPC) is a photolithography technique that modifies photomask geometry to counteract diffraction and process effects, ensuring accurate printing of patterns on the wafer. This work demonstrates a proof of concept showing how using a GPU-based approach can significantly speed up these modifications compared to a CPU.

cpp cuda gpu-acceleration photolithography semiconductors

Last synced: 02 Oct 2025

https://github.com/sankeer28/pptx-text-audio-transcriber

Extract text and transcribe audio from PowerPoint presentations using OpenAI Whisper.

audio-transcription cuda openai-whisper powerpoint pptx-parser

Last synced: 02 Oct 2025

https://github.com/mrtejas/cv-sandbox

A collection of Computer Vision mini-projects tuned for a number of tasks, including face detection, object detection, image segmentation and CLIP. Trained on popular datasets and includes comparative study of the methods. Done as a part of S24 course : Computer Vision at IIIT Hyd

computer-vision cuda ml opencv pytorch yolo

Last synced: 01 May 2026

https://github.com/xza85hrf/flux_pipeline

FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.

ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model

Last synced: 05 Jul 2025

https://github.com/doxakis/cosinesimilaritydistancesongpu

Compute cosine similarity distances for all combinations of the dataset on the gpu with CUDA

cuda

Last synced: 13 Apr 2026

https://github.com/desmondjs/cuda_mceliece_kem

CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and full FYP2 report

academic-project benchmarking classic-mceliece cuda fyp gpu-acceleration kem pqc

Last synced: 02 Oct 2025

https://github.com/illagrenan/cuda-80-cudnn6-runtime-1604-py36

Ubuntu 16.04 with Python 3.6 and CUDA Dockerfile

cuda dockerfile ubuntu

Last synced: 22 Jun 2025

https://github.com/nvaranki/cmmx

CUDA matrix multiplication (official guide, modified)

cuda cuda-kernels

Last synced: 08 Aug 2025

https://github.com/f-koehler/itesol

WIP: Iterative eigensolvers for C++20, Python and CUDA

cpp20 cuda eigenvalues linear-algebra python

Last synced: 08 Nov 2025

https://github.com/eyelor/text-to-image-item-generator

A Python workflow for generating random item images using models from Hugging Face.

ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image

Last synced: 13 Apr 2026

https://github.com/tzervas/unsloth-rs

Memory-optimized GPU kernels for LLM fine-tuning in Rust (2-5x speedup, 70-80% less VRAM)

cuda gpu machine-learning optimization rust

Last synced: 25 Jan 2026

https://github.com/cerit-sc/scipion-docker

Scipion (Cryo em image processing framework (https://scipion.i2pc.es/)) adapted to run in Kubernetes.

cryo-em cryoem cuda desktop kubernetes scipion vnc

Last synced: 02 Aug 2025

https://github.com/empenoso/doorcam-face-report

Пример проекта по распознаванию лиц с CUDA-ускорением. Включает скрипты для автоматической сборки dlib и анализа видео на GPU

cuda dlib dlib-face-detection

Last synced: 19 May 2026

https://github.com/tornikeo/sample-openmp-in-cuda

Sample of using OpenMP and CUDA: single GPU, multiple CPU

cuda meson openmp

Last synced: 01 Aug 2025

https://github.com/fikri-rouzan/cuda-c-program-part-3

CUDA C program from NVIDIA course.

c cuda

Last synced: 01 May 2026

https://github.com/9prady9/imageconvolve

Qt app for previewing Image convolution. Uses CUDA for convolution.

c-plus-plus convolution cuda desktop-app qt

Last synced: 03 May 2026

https://github.com/darshanakgr/meanfiltergpu

A gpu implementation of mean filter in CUDA

c cuda image-processing

Last synced: 01 May 2026

https://github.com/aledinola/ifp_cuda_mex

Solve the income fluctuation problem on the GPU

cuda gpu-computing matlab mex

Last synced: 14 May 2026

https://github.com/hrolive/fundamentals-of-accelerated-computing-with-cuda-python

Explore how to use Numba—the just-in-time, type-specializing Python function compiler—to create and launch CUDA kernels to accelerate Python programs on massively parallel NVIDIA GPUs.

accelerated-computing cuda cuda-programming jit numba nvidia python

Last synced: 01 May 2026

https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models

In this project, pneumonia detection using deep learning, a subset of artificial intelligence, is aimed. The performance of deep learning algorithms, including CNN, VGG16, and ResNet50 models, in detecting pneumonia has been examined.(Bu projede yapay zekanın alt kümesi olan derin öğrenme ile zatürre tespiti amaçlanmaktadır.)

artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning

Last synced: 13 Jun 2025

https://github.com/jarmak-personal/vibespatial

GPU-first spatial analytics for Python. Drop-in GeoPandas replacement powered by runtime-compiled CUDA kernels

cccl cuda geodataframe geopandas geospatial gpu gpu-computing nvrtc python spatial-analytics

Last synced: 21 Apr 2026

https://github.com/shambac/shamboflow

Fierce tensorflow competitor

cuda cupy machine-learning numpy pypi-package

Last synced: 19 Feb 2026

https://github.com/ronaldsg20/compu-paralela

Códigos de ejemplo para computación paralela y distribuida

cuda opencv openmp posix-threads

Last synced: 14 May 2026

https://github.com/luis-kr/depthmap

Depth map estimation tool using Depth-Anything-V2. Generate accurate depth maps from images with support for both relative and metric depth measurements.

cuda depth-anything depth-estimation depth-map image-processing python pytorch

Last synced: 08 Feb 2026

https://github.com/macaycz/nn

A lightweight, GPU-accelerated machine learning library built with CUDA.

cuda deep-learning gpu machine-learning neural-network

Last synced: 25 Jul 2025

https://github.com/malolm/football-player-detection-with-yolov8

Football player detection YOLOv8 fine-tuning

cuda jupyterlab python3 yolov8-detection

Last synced: 07 May 2026

https://github.com/mrgkanev/tensorflow-gpu-docker-setup

A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes

cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow

Last synced: 13 Apr 2026

https://github.com/faresargus/artaxerxes

Adaptive high-performance stress tester "artaxerxes" supports GPU, io_uring, DPDK, and eBPF/XDP for advanced cybersecurity labs. Ideal for network testing. 🚀🛠️

cuda cuda-programming cybersecurity cybersecurity-education cybersecurity-tools dpdk ebpf educational github-config high-performance network-security network-security-tool penetration-testing penetration-testing-framework penetration-testing-tools stress-testing

Last synced: 24 Jul 2025

https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation

The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network

cuda neural-network python pytorch

Last synced: 08 May 2026

https://github.com/fanziyang-v/parallel-computing

Parallel Computing course materials from Harbin Institute of Technology(Shenzhen).

cuda openmp openmpi parallel-computing

Last synced: 27 Mar 2025

https://github.com/parxd/cuda-optim

optimizing CUDA kernels

cuda machine-learning

Last synced: 26 Mar 2025

https://github.com/drc0ns0le/rtxvideoprocessor

CLI tool to apply NVIDIA RTX VSR and TrueHDR processing to video files

cuda ffmpeg hdr nvidia rtx upscale

Last synced: 20 Apr 2026

https://github.com/lttofu/cosmic

Fast, lightweight GUI-based C++ Ethereum ERC918 token miner for Win64 | CUDA GPUs | CPUs | Pool | Solo Mining

0xbitcoin 0xbtc cplusplus cplusplus-cli cpuminer cuda erc20 erc918 ethereum ethereum-token gpuminer gui pool-mining solo-mining windows windows-10 windows-7 windows-gui winforms

Last synced: 08 Apr 2026

https://github.com/awikramanayake/optimized-matrix-mult

Optimizing matrix multiplication using parallelism and SIMD (AVX2, CUDA)

avx2 cuda matrix-multiplication

Last synced: 22 May 2026

https://github.com/andresvalle/ocr-extraction

Text extraction from images using EasyOCR and parallelization with PyTorch

cuda ocr pytorch

Last synced: 01 May 2026

https://github.com/lord-turmoil/cudacmakedemo

A demo for building CUDA program with CMake

cuda tutorial

Last synced: 16 Mar 2025

https://github.com/delusionary/histoptimizer

Solves a minimum variance cost of the partition problem.

cuda numba python

Last synced: 14 Jan 2026

https://github.com/dgcnz/nvtx-vscode

Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.

cuda nvtx pytorch vscode

Last synced: 13 Apr 2026

https://github.com/ran-2012/cuda-practice

cuda practice code for nvidia programming guide

cuda

Last synced: 27 Feb 2025

https://github.com/avicted/hip_fm_synthesis

This project demonstrates FM Synthesis (Frequency Modulation) using HIP (Heterogeneous Compute Interface), enabling high-performance sound generation on both AMD and NVIDIA GPUs.

amd audio-processing cuda fm-synthesis hip nvidia rocm

Last synced: 16 Mar 2025

https://github.com/nel-s/vein-cracker

Recovers which internal generator states could have generated a provided set of Minecraft Java b1.6-1.12.2 veins. Those can then be used to recover 3/4ths of any worldseeds that could have generated them.

cuda minecraft seedcracking veins

Last synced: 16 Mar 2025

https://github.com/curiousci/wind

Multicore Systems Programming project

cuda mpi openmp pthreads

Last synced: 25 Dec 2025

https://github.com/arya2004/parallel-computing

Parallel Computing Uni Course

cuda

Last synced: 18 May 2026

https://github.com/cripterhack/business-address-scrapper

Python+Scrapy - Distributed scraping system with cache for business information extraction.

cuda ollama postgresql python redis scraper scraping scrapy tesseract

Last synced: 14 Jun 2025

https://github.com/bergolho/sycl

Repository with simple programs to learn SYCL.

cpp cuda sycl

Last synced: 16 May 2026

https://github.com/thesupercd/cuda_sort

A simple project implementing and measuring the runtime performance metrics related to massively parallel algorithms (radix sort) on an NVIDIA GPU device.

benchmarking c cpp cuda cuda-programming gpu-acceleration gpu-programming multithreading parallel-processing radix-sort sorting-algorithms

Last synced: 10 May 2026

https://github.com/grindelfp/cuda-texture-memory

Exercise on using texture memory in CUDA.

cuda texture-memory

Last synced: 30 Mar 2025

https://github.com/shermanlo77/oxwasp_phd

Code for the PhD thesis. The topic was on defect detection of 3D printing using x-rays. The repository includes an implementation of the mode filter and empirical null filter.

3d-printing applied-statistics computational-statistics cuda empirical-null imagej mode-filter statistics xray-projection

Last synced: 27 Mar 2025

https://github.com/thesoenke/deeplearning-docker

Setup for Deep Learning experiments in Docker with Cuda

cuda docker fastai jupyter

Last synced: 11 May 2026

https://github.com/dragonscypher/prompty

Tool for generating smart and secure prompts for language models!

autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading

Last synced: 02 Jan 2026

https://github.com/rugleb/cuda

A simple example of a program that uses parallel GPU computing on an NVIDIA graphics card using CUDA technology

cuda gpu nvidia

Last synced: 10 Apr 2025