An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/zelosleone/audiobook-generator

A GPU-accelerated Python application that converts PDF and TXT documents into high-quality MP4 audio files using WhisperSpeech technology.

ai-audio audiobook cuda gpu-acceleration machine-learning pdf-converter python pytorch speech-synthesis text-processing text-to-speech

Last synced: 05 May 2026

https://github.com/hurbalurba/quick-llama.cpp-server

A framework for publishing a more modern CUDA image for llama.cpp, built with CUDA 13 for newer cards only, with RPC support. Started as an exercise in learning how to compile a custom llama.cpp.

cuda cuda13 devops docker dockerbuild gguf llamacpp llm rpc

Last synced: 05 May 2026

https://github.com/xaionaro/cufft-grpc

Export cuFFT through gRPC

cmake cuda cufft fft fourier go golang gpu grpc transformation

Last synced: 05 May 2026

https://github.com/j89103138/yolov11-traffic-sign

This repository contains a YOLOv11 project for training, detection, and benchmarking of traffic signs. The project utilizes CUDA acceleration to enhance performance and efficiency in real-time traffic sign detection and evaluation.

cuda opencv python pytorch traffic traffic-sign yolov11

Last synced: 05 May 2026

https://github.com/barrrry1/claymore-s-dual-miner

Claymore's Dual Miner is a powerful GPU mining software designed for Ethereum (ETH) and simultaneous dual mining of coins like Decred, Siacoin, Pascal, and Lbry. It supports AMD and NVIDIA GPUs, leveraging OpenCL and CUDA optimization for maximum hashrate. Features include automatic GPU tuning, detailed statistics, and stability watchdog.

blockchain crypto-mining cryptocurrency cuda eth ethereum gpu-mining mining mining-pool opencl

Last synced: 05 May 2026

https://github.com/jakubfr4czek/concurrent-gauss-elimination

Concurrent Gaussian elimination algorithm implemented using trace theory. Parallelism is achieved using CUDA cores.

agh agh-ust agh-wi conda cuda cuda-kernels cuda-toolkit diekert-graph graphviz java python python3 traces-theory

Last synced: 05 May 2026

https://github.com/abdelrahman-amen/active_learning_with_different_query_strategies

This project explores the implementation of active learning techniques, focusing on various query strategies to optimize the selection of informative data points for model training. It aims to reduce the amount of labeled data required while improving model performance, especially in scenarios with limited labeled data.

activelearning cuda entropy kldivergence margin numpy python pyto uncertainty

Last synced: 06 May 2026

https://github.com/insanelywicked1/literate-dollop

A fully automated PowerShell script to compile PyTorch from source with CUDA 12.1 support for NVIDIA RTX 50-series GPUs, optimized for Windows 11.

blackwell cuda gpu-build pytorch rtx5080 rtx5090 windows

Last synced: 06 May 2026

https://github.com/hritiksauw199/human-face-to-cartoon-conversion-using-optimized-cyclegan

Transform real human faces into cartoon-style images using a reduced CycleGAN architecture optimized for efficiency and quality.

cuda cyclegan data-science deep-learning deep-neural-networks gan human-cartoon matplotlib neural-network python pytorch torchvision

Last synced: 06 May 2026

https://github.com/iglee/jax-cuda-eicl-exp-docker

Docker setup for getting JAX to work with CUDA, for reproducing ML experiments like eicl. Sure, let's NOT make a compatibility matrix and let people fight for their lives on CUDA.

cuda docker jax jaxline ml-engineering ml-experiments tensorflow

Last synced: 06 May 2026

https://github.com/smilu97/system-hyu

Repository for submitting Hanyang University system programming assignments

c cuda linux matrix

Last synced: 06 May 2026

https://github.com/raiszo/cs334

Journey through Intro to Parallel Programming

cmake cs334 cuda msbuild

Last synced: 06 May 2026

https://github.com/r00tens/text-classifier

Naive Bayes classifier for text classification with CPU and GPU (CUDA)

classification classifier cpp cuda machine-learning naive-bayes

Last synced: 06 May 2026

https://github.com/iamfaham/model-inference-profiler

A PyTorch-based tool for profiling deep learning model inference performance, analyzing computational bottlenecks, and visualizing resource utilization.

cuda memory pytorch visualizations

Last synced: 06 May 2026

https://github.com/rosnavigator/parallelkmeansimagecompressor

Parallel KMeans-based image quantization compressor that reduces the number of colors in an image while preserving visual quality. It uses KMeans clustering for color quantization and supports sequential, OpenMP, MPI, and CUDA implementations for performance and scalability. PoliMi - Advanced Methods for Scientific Computing (2023-2024)

boost clustering colors compression cuda image-quantization kmeans kmeans-clustering lossy-compression mpi odette opencv openmp parallel-computing parallel-programming performance polimi scalability sl-train

Last synced: 06 May 2026

https://github.com/jamesnulliu/learning-programming-massively-parallel-processors

Learning notes on Programming Massively Parallel Processors, 4th edition.

cuda notes pytorch

Last synced: 06 May 2026

https://github.com/mka-codelake/wispy

Minimalist push-to-talk dictation tool for Windows. Faster Whisper, local, offline.

cuda dictation faster-whisper local offline portable push-to-talk python speech-to-text stt transcription voice-input whisper windows

Last synced: 06 May 2026

https://github.com/sebp/vscode-sycl-dpcpp-cuda

Sample project to use the VS Code Remote - Containers extension to develop SYCL applications for NVIDIA GPUs using the oneAPI DPC++ compiler.

cuda dpcpp fedora gpu-computing podman sycl vscode

Last synced: 06 May 2026

https://github.com/jpuigcerver/prob-phoc

Probabilistic relevance scores from PHOC embeddings

cuda keyword-spotting kws phoc pytorch

Last synced: 07 May 2026

https://github.com/drilonaliu/parallel-sierpinski-triangle

GPU-accelerated Sierpinski Triangle generation with CUDA and OpenGL interoperability.

cuda fractals gpu parallel-programming sierpinski-triangle

Last synced: 07 May 2026

https://github.com/yuuuuurei/yolo-sibi

Real-time SIBI hand gesture detection using YOLOv8 and deep learning classifiers.

bahasa-indonesia bahasa-isyarat cuda deep-learning hand-gesture hand-gesture-recognition pytorch real-time sibi sign-language yolo yolov8

Last synced: 07 May 2026

https://github.com/drilonaliu/parallel-koch-snowflake

GPU-accelerated Koch Snowflake generation with CUDA and OpenGL interoperability.

cuda fractals gpu koch-snowflake parallel-programming

Last synced: 07 May 2026

https://github.com/noorkhokhar99/how-to-setup-nvidia-gpu-for-object-detection-installing-cuda-toolkit-and-cudnn

How to set up an NVIDIA GPU for object detection | Installing CUDA Toolkit and cuDNN

computer cuda nividia opencv python roboflow vision

Last synced: 07 May 2026

https://github.com/muhamadajiw/parallel-matrix-inversion

A parallel program for matrix inversion using MPI, OpenMP, and CUDA

cpp cuda mpi openmp

Last synced: 07 May 2026

https://github.com/shreya888/learning-cuda-with-cpp-and-pytorch

My notes, code, & insights will be recorded here while learning CUDA with C++ and PyTorch

cpp cuda pytorch

Last synced: 07 May 2026

https://github.com/stevenchang5/canny_edge

Implementation of Canny edge detection, with an option to use CUDA to improve performance

cuda edge-detection opencv

Last synced: 07 May 2026

https://github.com/rssr25/cuda

Following the CUDA by Example book.

cpp cuda cuda-programming hpc shaders

Last synced: 07 May 2026

https://github.com/wpjunior/cuda-numba-playground

Some uses of CUDA with the Numba framework

cuda numba python

Last synced: 07 May 2026

https://github.com/not-ml/ml-3

A PyTorch-based Convolutional Neural Network (CNN) for image classification using the CIFAR-10 dataset, featuring advanced architecture, data augmentation, GPU support, and dynamic learning rate scheduling.

ai cifar10 cnn cuda gpu image-classification machine-learning modeltraining python pytorch torchvision

Last synced: 08 May 2026

https://github.com/jimmygizmo/tensorpup

Machine-learning model training using parallelization strategies on multiple serverless GPU instances.

ai cuda cudnn distributed gpu serverless tensorflow

Last synced: 08 May 2026

https://github.com/popke523/rybki

A 3D shoal of fish animation using the boids algorithm, OpenGL for rendering and CUDA for parallel processing.

boids cuda opengl

Last synced: 08 May 2026

https://github.com/sydney-informatics-hub/computer-vision-fine-tuning

Fine-tune a computer vision model to solve your task locally, on HPC, in a container, or in the cloud!

computer-vision cuda deep-learning python

Last synced: 09 May 2026

https://github.com/sugarcane-mk/finetuning_wav2vec2

This repo provides a step-by-step process, starting from scratch, for fine-tuning Facebook's wav2vec2-large model using transformers

asr asr-model cuda facebook fairseq fine-tuning finetuning huggingface librosa python torch transformers wav2vec2 wav2vec2-large-960h

Last synced: 09 May 2026

https://github.com/ginkobalboa/parfis

Particles and field simulator. Written in C++ with Python bindings. The algorithm is based on the particle-in-cell (PIC) method used for interacting many-particle systems.

cpp cuda physics-simulation python

Last synced: 09 May 2026

https://github.com/dbklim/optimized_tensorflow_wheels

Optimized versions of TensorFlow and TensorFlow-GPU for specific CPUs and GPUs (both old and new).

cuda nvidia-cuda nvidia-gpu tensorflow tensorflow-community-wheels tensorflow-gpu tensorflow-packages tensorflow-whells wheels

Last synced: 09 May 2026

https://github.com/lfrati/subpair

Fast pairwise cosine distance calculation and numba accelerated evolutionary matrix subset extraction 🍐🚀

cosine-distance cuda numba

Last synced: 09 May 2026

https://github.com/starlitdreams/lunar-landing

This project implements a DQN agent using PyTorch to solve the LunarLander-v2 environment from OpenAI Gym. The agent learns to control the lunar lander using experience replay and a target network, aiming to maximize rewards by landing smoothly. Uses CUDA for computation.

artificial-intelligence cuda deep-learning gymnasium neural-network neural-networks numpy nvidia-gpu python python3 torch

Last synced: 09 May 2026

https://github.com/donaurelio/ansible-playbooks

A bunch of Ansible playbooks that automate computer infrastructure provisioning

ansible-playbooks cuda docker gromacs openmpi

Last synced: 09 May 2026

https://github.com/michaelfranzl/image_fah-client

Dockerfile for Folding@home client with AMD and Nvidia GPGPU support

container cuda debian docker foldingathome gpu-computing opencl

Last synced: 09 May 2026

https://github.com/edumucelli/build-tensorflow

Build Tensorflow from source using a Dockerfile

cuda cudnn docker tensorflow

Last synced: 10 May 2026

https://github.com/sebftw/interp2gpu

GPU-accelerated 2D spline interpolation, à la interp2(..., "spline"), in MATLAB.

cuda gpu gpu-acceleration matlab spline spline-interpolation

Last synced: 10 May 2026

https://github.com/cashcon57/open-supersampling

OpenSuperSampling (OSS) — vendor-agnostic open-source RT denoising, upscaling, and frame extrapolation

cuda deep-learning dlss frame-generation fsr game-engine gaussian-splatting open-source real-time-rendering super-resolution upscaling

Last synced: 10 Jun 2026

https://github.com/dlr-amr/t8gpu

Header-only finite volume library targeting GPUs using t8code as meshing backend.

adaptive-mesh-refinement cuda finite-volume gpgpu-computing hpc mesh mpi parallel-computing simulation

Last synced: 10 May 2026