An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/chensongpoixs/cmedia_transcode

Media service transcoding edition for GPU (CUDA), with support for H.264 and H.265 transcoding

cuda gpu h264 h265 media transcode-media

Last synced: 19 May 2026

https://github.com/belrbez/ship-graphic-qt-qml-cuda-c

Client-Server application for Rocket driving in QML graphics

c client-server cpp cuda qml qt5 rocket

Last synced: 08 Apr 2026

https://github.com/kratugautam99/logiclink-project

LogicLink is a conversational AI chatbot developed by Kratu Gautam (AIML Engineer). Powered by the TinyLlama-1.1B-Chat-v1.0 model, it provides an interactive interface for engaging conversations, query resolution, and task assistance. Version 5 features streaming responses, conversation management, and a sleek GUI.

antd-design chatbot-application conversational-ai cuda gradio graphical-user-interface huggingface-spaces huggingface-transformers jupyter-notebooks keras large-language-models mlops model-service-controller modelscope-studio natural-language-generation natural-language-processing pytorch reasoning-agent tensorflow

Last synced: 07 Apr 2026

https://github.com/alexkranias/triton_vs_cuda

Building Triton and CUDA kernels side-by-side to create a cuBLAS-performant GEMM kernel.

cuda cuda-kernels gpu gpu-programming parallel-programming python triton

Last synced: 20 Apr 2026

https://github.com/zhaocc1106/cuda-programming

Learning CUDA programming

cuda nvidia

Last synced: 23 Mar 2025

https://github.com/zhaocc1106/cuxx-programing

Samples for several CUDA libraries: CUDA, cuBLAS, cuBLASLt, cuSPARSE...

cublas cublaslt cuda cusparse

Last synced: 23 Mar 2025

https://github.com/td99/ai-sandbox

A collection of AI tools and prototypes.

ai cuda docker image-generation-ai nvidia python

Last synced: 08 Apr 2026

https://github.com/neel-dandiwala/npp_cudaatscale_project

For the enterprise course project, I have created a model that executes the histogram equalisation procedure on the given input image file.

cuda npp

Last synced: 30 Apr 2026

https://github.com/xza85hrf/flux_pipeline

FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.

ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model

Last synced: 05 Jul 2025

https://github.com/doxakis/cosinesimilaritydistancesongpu

Compute cosine similarity distances for all combinations of the dataset on the GPU with CUDA

cuda

Last synced: 13 Apr 2026

https://github.com/i-m-iron-man/abmax

Abmax is an agent-based modelling framework in JAX, focused on dynamic population size

abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python

Last synced: 04 Oct 2025

https://github.com/9prady9/archdock

Arch Linux Docker image for app development

arch-linux arrayfire cuda docker-image forge opencl

Last synced: 03 May 2026

https://github.com/jusqua/dip-benchmark

Departmental undergraduate research project at UFS. Digital image processing benchmark using multiple tools to learn new ways to develop image processors.

benchmark cuda image-processing matlab opencv sycl visiongl

Last synced: 20 Apr 2026

https://github.com/zyn10/cuda_code

CUDA practice

cuda cuda-programming

Last synced: 22 Jun 2025

https://github.com/kar-dim/cas-2d

Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA/OpenCL, for sharpening static images.

cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen

Last synced: 22 Jun 2025

https://github.com/cs550-epfl/review

Review of the paper A Formal Analysis of the NVIDIA PTX Memory Consistency Model

cuda formal-verification gpu memory-consistency ptx simt

Last synced: 30 Mar 2025

https://github.com/eyelor/text-to-image-item-generator

A Python workflow for generating random item images using models from Hugging Face.

ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image

Last synced: 13 Apr 2026

https://github.com/bonevbs/cuknn

CUDA implementation of k-nearest neighbor search

cuda knn-search

Last synced: 20 Apr 2026

https://github.com/py-sandy/llama.cpp-windows-builder

Automated, reproducible build scripts for llama.cpp on Windows 10/11. Installs prerequisites, configures CMake and builds with CUDA.

ai build-scripts build-tool builder cuda llamacpp script scripts windows windows-10 windows-11

Last synced: 20 Apr 2026

https://github.com/kanchishimono/python-images

Ubuntu based Python container images, including CUDA images

container-image cuda docker dockerfile machine-learning python python3

Last synced: 30 Apr 2026

https://github.com/rkarahul/person-detector-faceverifier

Person-Detector-FaceVerifier is a sophisticated system for detecting and verifying faces in images. Ideal for applications like passport control and security, it combines advanced face detection with precise verification techniques.

bootstrap5 css3 cuda django html5 javascipt opencv-python os python pytorch yolov8

Last synced: 07 Apr 2026

https://github.com/ribin-baby/cuda_cudnn_installation_on_ubuntu20.04

Installation of CUDA 11.8 with cuDNN 8.7 on an Ubuntu 20.04 server with an A30 GPU, plus an ONNX GPU installation guide

cuda gpu linux onnxruntime server

Last synced: 16 May 2026

https://github.com/fedesky25/hpc-project-2024

Project for the 2024 HPC course: a streamplot generator for complex-valued functions

complex-numbers cuda openmp

Last synced: 30 Mar 2025

https://github.com/puzzlef/vector-multiplication-cuda

Comparing approaches for CUDA-based vector multiplication.

algorithm cuda map multiply operation pagerank primitive

Last synced: 30 Apr 2026

https://github.com/abhiram-kandiyana/cuda-blast-2024

Reimplementation of NCBI BLAST with CUDA backend for faster retrieval

blast cuda gpu-acceleration parallel-processing

Last synced: 15 Mar 2025

https://github.com/mvishiu11/kmeans-clustering

K-Means Clustering with both GPU (CUDA) and CPU implementations

cuda kmeans-clustering

Last synced: 15 Mar 2025

https://github.com/anne-andresen/autoencoder_3d_c_cuda

3D Autoencoder training in raw C/CUDA

3d autoencoder c cuda nifti

Last synced: 28 Apr 2026

https://github.com/mahshid1378/piper-plus-3

Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT, code supports SV) — C++, C#, Rust, Go, Python, npm (WASM). VITS + Prosody, streaming, CUDA/CoreML/DirectML. pip install piper-plus | npm install piper-plus | cargo install piper-plus-cli

cross-platform csharp cuda deep-learning dotnet japanese multilingual nuget onnx pytorch rust speech-synthesis streaming text-to-speech tts vits webassembly

Last synced: 08 Jun 2026

https://github.com/daelsepara/hipnewton

GPU Implementation of Newton Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk

Last synced: 03 May 2026

https://github.com/sandialabs/tenzing

Core library for optimizing CUDA+MPI programs as sequential decision problems.

cuda mpi scr-2759 sequential-decision-problem

Last synced: 29 Apr 2026

https://github.com/snandasena/cuda-at-scale-for-the-enterprise

Gauss Filter with CUDA and NPP

cpp cuda gpu nvidia

Last synced: 29 Apr 2026

https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models

This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.

artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning

Last synced: 13 Jun 2025

https://github.com/hshindo/libcuda.jl

CUDA GPU array for Julia

cuda gpu julia

Last synced: 16 May 2026

https://github.com/flosmume/cpp-cuda-deepvision-rtx-starter

CUDA C++ practice project for RTX 4070 SUPER — explore GPU concurrency, pinned memory, and Nsight profiling. Includes SAXPY and 2D blur kernels to train optimization, stream overlap, and timing analysis for NVIDIA Developer Technology Engineering skillset.

cpp cuda cuda-kernels cuda-streams deep-learning-inference gpu gpu-optimization gpu-profiling high-performance-computing nsight nvidia parrallel-computing pinned-memory

Last synced: 16 May 2026

https://github.com/dasbd72/nthu-ipc-2022

National Tsing Hua University - Introduction to Parallel Computing - 2022

cuda cuda-programming hpc mpi openmp pthreads

Last synced: 30 Mar 2025

https://github.com/mrkct/cuda-raytracer

Simple CUDA-Accelerated raytracer

cuda gpu raytracing raytracing-one-weekend

Last synced: 21 Apr 2026

https://github.com/ahmadrafidev/learn-cuda

A place where I learn about CUDA

cuda cuda-programming gpu os parallel-programming

Last synced: 13 Apr 2025

https://github.com/rai-project/dlperf

Déjà vu: Modeling DNN Performance by Recalling History

benchmark cuda deep-learning modeling onnx performance tensorflow

Last synced: 21 Apr 2026

https://github.com/musaibbashir/object-detection

PyTorch+CUDA implementation of several image classification and object detection models such as YOLO, Fast-CNN, and RF-DETR

cnn computer-vision cuda image-classification object-detection pytorch yolo

Last synced: 21 Apr 2026

https://github.com/grindelfp/cuda-n-body-simulation

Simulation of N-Body movement using CUDA.

cuda n-body-simulation

Last synced: 06 Apr 2025

https://github.com/bjornmelin/ml-algorithm-playground

🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈

algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost

Last synced: 13 May 2026

https://github.com/mrgkanev/tensorflow-gpu-docker-setup

A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes

cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow

Last synced: 13 Apr 2026

https://github.com/actepukc/uv-app-starter-pack

Bootstrap PySide6 GUI apps quickly using uv, with built-in PyTorch/CUDA handling.

astral-uv cross-platform cuda gui pyside6 python pytorch qt6 starter-kit template

Last synced: 30 Apr 2026

https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation

The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network

cuda neural-network python pytorch

Last synced: 08 May 2026

https://github.com/parxd/cuda-optim

Optimizing CUDA kernels

cuda machine-learning

Last synced: 26 Mar 2025

https://github.com/jiaau/kernels

This repository showcases common optimization techniques for kernels.

cpp cuda cute cutlass hpc kernel

Last synced: 21 Apr 2026

https://github.com/shermanlo77/poisson_icing

Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson-distributed random variables, each of which depends on its four neighbours. This dependency causes the Poisson random variables to be similar (or dissimilar) to their neighbours.

cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising

Last synced: 21 May 2026

https://github.com/sbstndb/nbody_k

A simple, naïve 3D N-body simulation using Kokkos, with a CUDA or OpenMP backend

cuda kokkos nbody openmp simulation

Last synced: 21 May 2026

https://github.com/sid911/neuralnetworkcpp

A small experiment to learn about neural networks and their runtimes in cpp

cpp cuda machine-learning neural-network

Last synced: 20 Aug 2025

https://github.com/lord-turmoil/cudacmakedemo

A demo for building a CUDA program with CMake

cuda tutorial

Last synced: 16 Mar 2025

https://github.com/delusionary/histoptimizer

Solves the partition problem with a minimum-variance cost.

cuda numba python

Last synced: 14 Jan 2026

https://github.com/dgcnz/nvtx-vscode

Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.

cuda nvtx pytorch vscode

Last synced: 13 Apr 2026

https://github.com/ran-2012/cuda-practice

CUDA practice code for the NVIDIA programming guide

cuda

Last synced: 27 Feb 2025

https://github.com/fxzxmicah/fedora-llama-cpp

llama.cpp tools with OpenMP, CUDA, and OpenVINO support

cuda fedora llama-cpp openmp openvino rpm

Last synced: 05 Jun 2026

https://github.com/avicted/hip_fm_synthesis

This project demonstrates FM (Frequency Modulation) synthesis using HIP (Heterogeneous-computing Interface for Portability), enabling high-performance sound generation on both AMD and NVIDIA GPUs.

amd audio-processing cuda fm-synthesis hip nvidia rocm

Last synced: 16 Mar 2025

https://github.com/nel-s/vein-cracker

Recovers which internal generator states could have generated a provided set of Minecraft Java b1.6-1.12.2 veins. Those can then be used to recover 3/4ths of any worldseeds that could have generated them.

cuda minecraft seedcracking veins

Last synced: 16 Mar 2025