CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-07-01 00:07:09 UTC
https://github.com/chensongpoixs/cmedia_transcode
Media service transcoding, GPU (CUDA) version; supports H.264 and H.265 transcoding
cuda gpu h264 h265 media transcode-media
Last synced: 19 May 2026
https://github.com/belrbez/ship-graphic-qt-qml-cuda-c
Client-Server application for Rocket driving in QML graphics
c client-server cpp cuda qml qt5 rocket
Last synced: 08 Apr 2026
https://github.com/drilonaliu/parallel-image-edge-detection
cuda edge-detection gpu image-processing
Last synced: 17 May 2026
https://github.com/kratugautam99/logiclink-project
LogicLink is a conversational AI chatbot developed by Kratu Gautam (AIML Engineer). Powered by the TinyLlama-1.1B-Chat-v1.0 model, it provides an interactive interface for engaging conversations, query resolution, and task assistance. Version 5 features streaming responses, conversation management, and a sleek GUI.
antd-design chatbot-application conversational-ai cuda gradio graphical-user-interface huggingface-spaces huggingface-transformers jupyter-notebooks keras large-language-models mlops model-service-controller modelscope-studio natural-language-generation natural-language-processing pytorch reasoning-agent tensorflow
Last synced: 07 Apr 2026
https://github.com/alexkranias/triton_vs_cuda
Building Triton and CUDA kernels side-by-side to create a cuBLAS-performant GEMM kernel.
cuda cuda-kernels gpu gpu-programming parallel-programming python triton
Last synced: 20 Apr 2026
https://github.com/zhaocc1106/cuxx-programing
Samples for several CUDA libraries: cuda, cublas, cublaslt, cusparse...
Last synced: 23 Mar 2025
https://github.com/td99/ai-sandbox
A collection of AI tools and prototypes.
ai cuda docker image-generation-ai nvidia python
Last synced: 08 Apr 2026
https://github.com/neel-dandiwala/npp_cudaatscale_project
For the enterprise course project, I have created a model that executes the histogram equalisation procedure on the given input image file.
Last synced: 30 Apr 2026
https://github.com/eminem5410/devmind-platform
Linux-first CLI for AI environment diagnostics, repair & automation
ai automation cli cuda developer-tools devops docker generative-ai linux local-llm observability ollama python self-hosted system-monitoring
Last synced: 30 May 2026
https://github.com/xza85hrf/flux_pipeline
FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.
ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model
Last synced: 05 Jul 2025
https://github.com/doxakis/cosinesimilaritydistancesongpu
Compute cosine similarity distances for all combinations of the dataset on the GPU with CUDA
Last synced: 13 Apr 2026
https://github.com/i-m-iron-man/abmax
Abmax is an agent-based modelling framework in Jax, focused on dynamic population size
abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python
Last synced: 04 Oct 2025
https://github.com/9prady9/archdock
Arch linux docker image for app development
arch-linux arrayfire cuda docker-image forge opencl
Last synced: 03 May 2026
https://github.com/jusqua/dip-benchmark
Departmental undergraduate research project at UFS. Digital image processing benchmark using multiple tools to learn new ways to develop image processors.
benchmark cuda image-processing matlab opencv sycl visiongl
Last synced: 20 Apr 2026
https://github.com/kar-dim/cas-2d
Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA/OpenCL, for sharpening static images.
cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen
Last synced: 22 Jun 2025
https://github.com/cs550-epfl/review
Review of the paper A Formal Analysis of the NVIDIA PTX Memory Consistency Model
cuda formal-verification gpu memory-consistency ptx simt
Last synced: 30 Mar 2025
https://github.com/eyelor/text-to-image-item-generator
A Python workflow for generating random item images using models from Hugging Face.
ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image
Last synced: 13 Apr 2026
https://github.com/bonevbs/cuknn
CUDA implementation of k-nearest neighbor search
Last synced: 20 Apr 2026
https://github.com/py-sandy/llama.cpp-windows-builder
Automated, reproducible build scripts for llama.cpp on Windows 10/11. Installs prerequisites, configures CMake and builds with CUDA.
ai build-scripts build-tool builder cuda llamacpp script scripts windows windows-10 windows-11
Last synced: 20 Apr 2026
https://github.com/kanchishimono/python-images
Ubuntu based Python container images, including CUDA images
container-image cuda docker dockerfile machine-learning python python3
Last synced: 30 Apr 2026
https://github.com/rkarahul/person-detector-faceverifier
Person-Detector-FaceVerifier is a sophisticated system for detecting and verifying faces in images. Ideal for applications like passport control and security, it combines advanced face detection with precise verification techniques.
bootstrap5 css3 cuda django html5 javascipt opencv-python os python pytorch yolov8
Last synced: 07 Apr 2026
https://github.com/ribin-baby/cuda_cudnn_installation_on_ubuntu20.04
Installation of CUDA 11.8 with cuDNN 8.7 on an Ubuntu 20.04 server with an A30 GPU, plus an ONNX GPU installation guide
cuda gpu linux onnxruntime server
Last synced: 16 May 2026
https://github.com/fedesky25/hpc-project-2024
Project for the 2024 HPC course: a generator of streamplots of complex-valued functions
Last synced: 30 Mar 2025
https://github.com/abhiram-kandiyana/cuda-blast-2024
Reimplementation of NCBI BLAST with CUDA backend for faster retrieval
blast cuda gpu-acceleration parallel-processing
Last synced: 15 Mar 2025
https://github.com/mvishiu11/kmeans-clustering
K-Means Clustering with both GPU (CUDA) and CPU implementations
Last synced: 15 Mar 2025
https://github.com/anne-andresen/autoencoder_3d_c_cuda
3D Autoencoder training in raw C/CUDA
Last synced: 28 Apr 2026
https://github.com/mahshid1378/piper-plus-3
Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT, code supports SV) — C++, C#, Rust, Go, Python, npm (WASM). VITS + Prosody, streaming, CUDA/CoreML/DirectML. pip install piper-plus | npm install piper-plus | cargo install piper-plus-cli
cross-platform csharp cuda deep-learning dotnet japanese multilingual nuget onnx pytorch rust speech-synthesis streaming text-to-speech tts vits webassembly
Last synced: 08 Jun 2026
https://github.com/daelsepara/hipnewton
GPU Implementation of Newton Fractal Generator with Benchmarking
amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk
Last synced: 03 May 2026
https://github.com/sandialabs/tenzing
Core library for optimizing CUDA+MPI programs as sequential decision problems.
cuda mpi scr-2759 sequential-decision-problem
Last synced: 29 Apr 2026
https://github.com/snandasena/cuda-at-scale-for-the-enterprise
Gauss Filter with CUDA and NPP
Last synced: 29 Apr 2026
https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models
This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.
artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning
Last synced: 13 Jun 2025
https://github.com/flosmume/cpp-cuda-deepvision-rtx-starter
CUDA C++ practice project for RTX 4070 SUPER — explore GPU concurrency, pinned memory, and Nsight profiling. Includes SAXPY and 2D blur kernels to train optimization, stream overlap, and timing analysis for NVIDIA Developer Technology Engineering skillset.
cpp cuda cuda-kernels cuda-streams deep-learning-inference gpu gpu-optimization gpu-profiling high-performance-computing nsight nvidia parrallel-computing pinned-memory
Last synced: 16 May 2026
https://github.com/dasbd72/nthu-ipc-2022
National Tsing Hua University - Introduction to Parallel Computing - 2022
cuda cuda-programming hpc mpi openmp pthreads
Last synced: 30 Mar 2025
https://github.com/mrkct/cuda-raytracer
Simple CUDA-Accelerated raytracer
cuda gpu raytracing raytracing-one-weekend
Last synced: 21 Apr 2026
https://github.com/ahmadrafidev/learn-cuda
A place where I learn about CUDA
cuda cuda-programming gpu os parallel-programming
Last synced: 13 Apr 2025
https://github.com/rai-project/dlperf
Déjà vu: Modeling DNN Performance by Recalling History
benchmark cuda deep-learning modeling onnx performance tensorflow
Last synced: 21 Apr 2026
https://github.com/musaibbashir/object-detection
PyTorch+CUDA implementation of several image classification and object detection models such as YOLO, Fast-CNN, and RF-DETR
cnn computer-vision cuda image-classification object-detection pytorch yolo
Last synced: 21 Apr 2026
https://github.com/grindelfp/cuda-n-body-simulation
Simulation of N-Body movement using CUDA.
Last synced: 06 Apr 2025
https://github.com/bjornmelin/ml-algorithm-playground
🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈
algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost
Last synced: 13 May 2026
https://github.com/mrgkanev/tensorflow-gpu-docker-setup
A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes
cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow
Last synced: 13 Apr 2026
https://github.com/actepukc/uv-app-starter-pack
Bootstrap PySide6 GUI apps quickly using uv, with built-in PyTorch/CUDA handling.
astral-uv cross-platform cuda gui pyside6 python pytorch qt6 starter-kit template
Last synced: 30 Apr 2026
https://github.com/lu-m-dev/cuda-molecular-simulation
CUDA accelerated molecular simulation of materials
cuda materials-science molecular-dynamics molecular-simulation monte-carlo
Last synced: 25 Jun 2026
https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation
The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network
cuda neural-network python pytorch
Last synced: 08 May 2026
https://github.com/shermanlo77/poisson_icing
Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson distributed random variables but has a dependency on their four neighbours. This causes the Poisson random variables to be similar (or dissimilar) to their neighbours.
cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising
Last synced: 21 May 2026
https://github.com/sbstndb/nbody_k
A simple, naive 3D N-body simulation using Kokkos, with a CUDA or OpenMP backend
cuda kokkos nbody openmp simulation
Last synced: 21 May 2026
https://github.com/bfalls/img-compressor
GPU-accelerated JPEG compressor
cli-tool command-line compression cpp cpp-cuda-gpu-programming-parallel-computing cuda dct demo-project gpgpu gpu-programming high-performance-computing hpc image-compression image-processing jpeg parallel-computing
Last synced: 20 Apr 2026
https://github.com/sid911/neuralnetworkcpp
A small experiment to learn about neural networks and their runtimes in cpp
cpp cuda machine-learning neural-network
Last synced: 20 Aug 2025
https://github.com/lord-turmoil/cudacmakedemo
A demo for building a CUDA program with CMake
Last synced: 16 Mar 2025
https://github.com/delusionary/histoptimizer
Solves a minimum variance cost of the partition problem.
Last synced: 14 Jan 2026
https://github.com/dgcnz/nvtx-vscode
Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.
Last synced: 13 Apr 2026
https://github.com/ran-2012/cuda-practice
CUDA practice code for the NVIDIA programming guide
Last synced: 27 Feb 2025
https://github.com/avicted/hip_fm_synthesis
This project demonstrates FM (frequency modulation) synthesis using HIP (Heterogeneous-computing Interface for Portability), enabling high-performance sound generation on both AMD and NVIDIA GPUs.
amd audio-processing cuda fm-synthesis hip nvidia rocm
Last synced: 16 Mar 2025
https://github.com/nel-s/vein-cracker
Recovers which internal generator states could have generated a provided set of Minecraft Java b1.6-1.12.2 veins. Those can then be used to recover 3/4ths of any worldseeds that could have generated them.
cuda minecraft seedcracking veins
Last synced: 16 Mar 2025