Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
![](https://explore-feed.github.com/topics/cuda/cuda.png)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-15 00:06:58 UTC
https://github.com/shibatch/tlfloat
Template library for floating point operations
arbitrary-precision constexpr cplusplus cpp20 cuda float128 floating-point half-precision ieee754 math octuple-precision quadruple-precision templates
Last synced: 10 Jan 2025
https://github.com/dpetrosy/fractal
This project is a Fractal Visualizer developed in C++ with SFML and CUDA.
burning-ship cmake cmakelists cpp cpp-programming cpp-project cuda cuda-opengl cuda-programming fractal fractal-generation fractal-visualization julia mandelbox mandelbrot opengl opengl-project sfml sfml-library tricorn
Last synced: 21 Jan 2025
https://github.com/lablup/backend.ai-accelerator-cuda
The Backend.AI CUDA Accelerator Plugin
Last synced: 03 Jan 2025
https://github.com/lord-turmoil/cudacmakedemo
A demo for building a CUDA program with CMake
Last synced: 23 Jan 2025
https://github.com/dongskie43/nlp-engineering-hub
📚 Enterprise NLP systems and LLM applications. Features custom language model implementations, distributed training pipelines, and efficient inference systems. 🔤
cuda gpu-optimization huggingface huggingface-transformers langchain language-models large-language-models nlp openai python transformers
Last synced: 03 Feb 2025
https://github.com/matiasvlevi/cuno
Provides CUDA bindings, kernel maps and device memory management for Dannjs computations. [Experimental and not complete]
addon cuda dann dannjs machine-learning nodejs
Last synced: 11 Jan 2025
https://github.com/dreamjet31/licence_plate_detection
Automated License Plate recognition system
cuda opencv python pytorch ultralytics yolov8
Last synced: 10 Feb 2025
https://github.com/adesoji1/youtubesummaryai
Python script for YouTube summaries. The service should summarize a YouTube video by URL. It should work for long videos and for different languages.
cuda googleapi python3 speech-recognition transformers youtube-api-v3 youtube-dl
Last synced: 10 Feb 2025
https://github.com/hrolive/data-analytics-in-the-era-of-large-scale-machine-learning
Slides and other material for the Cyprus NCC training event about "Data analytics in the era of large-scale machine learning".
cuda deep-learning gpu-acceleration gradient-boosting large-language-models machine-learning preprocessing python pytorch
Last synced: 04 Jan 2025
https://github.com/mcobzarenco/bitonic.cu
CUDA bitonic sort in Rust
cuda parallel-computing rust sorting-algorithms
Last synced: 10 Feb 2025
https://github.com/jonastoth/cuda_raytracer
University project to implement a basic Raytracer in CUDA
Last synced: 02 Feb 2025
https://github.com/kenwuqianghao/c4ai-cuda-birds
Homework assignments for C4AI Beginners in Research-Driven Studies
Last synced: 27 Dec 2024
https://github.com/d-krylov/cuda_to_opengl
Simple examples for CUDA OpenGL interoperability
Last synced: 11 Jan 2025
https://github.com/sebftw/interp2gpu
GPU-accelerated 2D spline interpolation, à la interp2(..., "spline"), in MATLAB.
cuda gpu gpu-acceleration matlab spline spline-interpolation
Last synced: 14 Dec 2024
https://github.com/ray-chew/modified_ch
Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers
cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory
Last synced: 11 Jan 2025
https://github.com/ivanbuccella/sf2bio
Deep reinforcement learning for de novo drug design: a ReLeaSe method execution on a Docker Environment
cuda deep-learning deep-reinforcement-learning docker docker-compose machine-learning nvidia-cuda nvidia-docker reinforcement-learning release release-method
Last synced: 11 Jan 2025
https://github.com/sandialabs/tenzing
Core library for optimizing CUDA+MPI programs as sequential decision problems.
cuda mpi scr-2759 sequential-decision-problem
Last synced: 11 Jan 2025
https://github.com/llm-db/understanding-gpu-architecture-implications-on-llm-serving-workloads
Understanding GPU Architecture Implications on LLM Serving Workloads (Master Thesis, ETH Zürich, 2024)
cuda inference pytorch rocm transformer
Last synced: 14 Dec 2024
https://github.com/brainlesslabs/jalebi
C++ String algorithms for maximum performance
c-plus-plus cplusplus cpp cpp-library cpu cuda library parallel performance simd sse string string-matching vectorization
Last synced: 26 Jan 2025
https://github.com/proafxin/cuda-docker
High-performance computing images with PyCUDA and TensorRT preinstalled
cuda docker dockerfile libcudnn nvidia-tensorrt pycuda python tensorrt
Last synced: 12 Jan 2025
https://github.com/jaidevd/ipec-fdp
cuda hpc keras mapreduce numba spark tensorflow
Last synced: 01 Feb 2025
https://github.com/usman619/pdc
Parallel and Distributed Computing
cuda distributed-computing distributed-systems nextcloud
Last synced: 13 Jan 2025
https://github.com/kichappa/spy-sim
Simulate a spying strategy on a topography
combat-modeling cuda differential-equations julia modeling-and-simulation topography-simulation
Last synced: 12 Jan 2025
https://github.com/parxd/fasterdl
cuBLAS/CUDA tensor library with auto-diff support
cublas cuda cudnn deep-learning machine-learning
Last synced: 06 Jan 2025
https://github.com/simonschoelly/poisson-solver
A solver for a modified Poisson equation using CUDA.
cpp cuda finite-difference gpgpu pgc poisson-equation preconditioned-conjugate-gradient thomas-algorithm
Last synced: 12 Jan 2025
https://github.com/hr-fahim/transformer-model-optimization
Sample GPT Transformer Model from Scratch.
cuda few-shot-learning transfomers
Last synced: 24 Jan 2025
https://github.com/imanghd/parallelprocessing
CE Algorithms Lab @ SUT
cuda openmp parallel-algorithm parallel-processing systolic
Last synced: 02 Feb 2025
https://github.com/saiccoumar/cuda-programming-exercises
Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.
cuda cuda-programming nvcc nvidia
Last synced: 18 Jan 2025
https://github.com/deep-1704/coa-lab-repo
Computer Organization and Architecture lab assignments.
Last synced: 18 Jan 2025
https://github.com/kanchishimono/python-images
Ubuntu based Python container images, including CUDA images
container-image cuda docker dockerfile machine-learning python python3
Last synced: 26 Jan 2025
https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is
Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization
ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow
Last synced: 12 Oct 2024
https://github.com/sferez/sspp_sparse_matrix_cuda
Small Scale Parallel Programming, Sparse Matrix multiplication with CUDA
cpp cuda omp omp-parallel parallel-computing small-scale-parallel-programming sparse-matrix
Last synced: 13 Jan 2025
https://github.com/vectorworksreal/sd-forge-docker
SD Forge WebUI Docker image.
ai-art artificial-intelligence containerization cuda docker docker-image forge image-to-image machine-learning sd-forge stable-diffusion stable-diffusion-webui text-to-image ubuntu webui
Last synced: 10 Feb 2025
https://github.com/betarixm/csed490c
POSTECH: Heterogeneous Parallel Computing (Fall 2023)
cuda gpu parallel-computing postech
Last synced: 19 Jan 2025
https://github.com/wojcikmikolaj/particles-in-a-jar
Collisions between particles simulated on GPU.
algorithms-and-data-structures collision-detection collisions cuda gpu-programming
Last synced: 19 Jan 2025
https://github.com/amitkumarj441/deep-learning-on-your-finger
A rich collection of Dockerfiles for installing deep learning dependencies on your way :rocket:
Last synced: 26 Jan 2025
https://github.com/dhruvsrikanth/fastconv
Distributed and serial implementations of the 2D convolution operation in C++ and CUDA.
convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming
Last synced: 25 Dec 2024
https://github.com/illagrenan/cuda-80-cudnn6-runtime-1604-py36
Ubuntu 16.04 with Python 3.6 and CUDA Dockerfile
Last synced: 19 Jan 2025
https://github.com/illagrenan/cuda-90-cudnn7-runtime-1604-py36
Ubuntu 16.04 with Python 3.6 and CUDA9 Dockerfile
Last synced: 19 Jan 2025
https://github.com/larygwil/cuda-samples-old
Old NVIDIA CUDA samples (5.0 - 7.5)
Last synced: 02 Feb 2025
https://github.com/aaditya29/parallel-computing-and-cuda
Learning about Parallel Computing and GPU programming using CUDA.
c cpp cuda cuda-kernels cuda-programming nvidia-cuda openmp openmpi parallel-computing parallel-programming
Last synced: 07 Feb 2025
https://github.com/dwain-barnes/llm-gguf-auto-converter
Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.
auto-converter batch-processing cuda gguf huggingface jupyter-notebook llama-cpp llm model-quantization
Last synced: 31 Jan 2025
https://github.com/rssr25/cuda
Following the CUDA by Example book.
cpp cuda cuda-programming hpc shaders
Last synced: 24 Dec 2024
https://github.com/phrutis/minikeys_for_sale
GPU program for brute-forcing MiniKeys Casascius Serie1 (22 characters)
bitcoin brute-force btc casascius cuda gpu minikeys program uncompressed
Last synced: 24 Jan 2025
https://github.com/storterald/neural-network
Simple neural network implementation in C++ and CUDA
asm asmx86 c-plus-plus cmake cpp cuda machine-learning neural-network
Last synced: 02 Feb 2025
https://github.com/ludgerpaehler/lulesh-enzyme
AD with Enzyme through Lulesh.
automatic-differentiation cuda cuda-programming gpu-computing high-performance-computing llvm-enzyme scientific-computing
Last synced: 05 Jan 2025
https://github.com/abdelrahman-amen/active_learning_with_different_query_strategies
This project explores the implementation of active learning techniques, focusing on various query strategies to optimize the selection of informative data points for model training. It aims to reduce the amount of labeled data required while improving model performance, especially in scenarios with limited labeled data.
activelearning cuda entropy kldivergence margin numpy python pyto uncertainty
Last synced: 24 Jan 2025
https://github.com/sugarcane-mk/finetuning_wav2vec2
This repo provides a step-by-step process, from scratch, for fine-tuning Facebook's wav2vec2-large model using transformers
asr asr-model cuda facebook fairseq fine-tuning finetuning huggingface librosa python torch transformers wav2vec2 wav2vec2-large-960h
Last synced: 24 Jan 2025
https://github.com/marius311/cudadistributedtools.jl
A set of utility tools for multi-GPU + multi-process workflows
Last synced: 07 Feb 2025
https://github.com/ramyacp14/document-based-question-and-answers
Developed a document question answering system that utilizes Llama and LangChain for contextual and accurate answers. The system supports .txt documents, intelligent text splitting, and context-aware querying through an easy-to-use Streamlit interface.
chroma cuda hugging-face langchain llama python recursivecharactertextsplitter streamlit
Last synced: 12 Oct 2024
https://github.com/bjornmelin/ml-production-engineering
⚙️ End-to-end ML deployment solutions. Focused on model serving, multi-GPU optimization, and production-grade system implementation. 🎯
cuda deployment docker fastapi gpu-computing kubernetes mlops production
Last synced: 25 Jan 2025
https://github.com/amypad/miutil
Basic functionality needed for AMYPAD
cuda matlab medical-imaging python
Last synced: 31 Oct 2024
https://github.com/bjornmelin/ml-vision-lab
👁️ Production-grade computer vision implementations. Real-world applications in image processing, object detection, and video analytics with GPU acceleration. 📸
computer-vision cuda deep-learning image-processing object-detection opencv pytorch video-analytics
Last synced: 25 Jan 2025
https://github.com/jegp/aestream-paper
AEStream paper
coroutines cuda event-based-vision gpu
Last synced: 08 Feb 2025
https://github.com/bjornmelin/nlp-engineering-hub
📚 Enterprise NLP systems and LLM applications. Features custom language model implementations, distributed training pipelines, and efficient inference systems. 🔤
cuda gpu-optimization huggingface huggingface-transformers langchain language-models large-language-models nlp openai python transformers
Last synced: 25 Jan 2025
https://github.com/mateuszk098/parallel-programming-examples
Simple parallel programming examples with CUDA, MPI and OpenMP.
cpp cuda mpi openmp parallel-programming
Last synced: 28 Dec 2024
https://github.com/neel-dandiwala/cuda-programs
Miscellaneous programs for grasping the concepts of parallel computing
cuda gpu-programming parallel-programming
Last synced: 26 Dec 2024
https://github.com/corazon-code/pyloo
Python package for approximate leave-one-out cross-validation (LOO-CV) and Pareto smoothed importance sampling (PSIS) for Bayesian Modeling
bayes bayesian-data-analysis cross-validation cuda dump fuzzy-matching looker loot-table machine-learning minecraft model-comparison python spreadsheet tensorflow
Last synced: 09 Feb 2025
https://github.com/voltr0x/raytracing-cuda
Raytracing in a weekend using CUDA
Last synced: 20 Jan 2025
https://github.com/jpodivin/gputomata
Cellular automata running on CUDA capable GPUs
cellular-automata cellular-automaton cuda
Last synced: 27 Dec 2024
https://github.com/bjornmelin/ml-algorithm-playground
🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈
algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost
Last synced: 25 Jan 2025
https://github.com/xstupi00/N-Body-CUDA
PCG - Parallel Computations on GPU - Project - N-Body-CUDA
cuda gpu-acceleration gpu-computing nbody-simulation optimization parallel-computing pcg vut vut-fit
Last synced: 23 Oct 2024
https://github.com/pauloruszel/yolo11_face_detection
cuda nvcc nvidia-gpu pip python3 pytorch widerface-dataset yolo11
Last synced: 09 Feb 2025
https://github.com/miferreiro/cdap-cuda
CUDA exercises for the course "Computación Distribuída e de Altas Prestacións" in the Master's Degree in Computer Engineering at the University of Vigo in 2020
Last synced: 27 Dec 2024
https://github.com/muneeb706/cuda
Sample programs implemented using CUDA (GPU)
cplusplus cuda gpu-programming
Last synced: 31 Jan 2025
https://github.com/jamesnulliu/learning-programming-massively-parallel-processors
Learning notes on Programming Massively Parallel Processors, 4th edition.
Last synced: 02 Feb 2025
https://github.com/xza85hrf/flag_prediction_project
This application predicts the name of a country (or countries) based on an input flag image. It uses advanced image processing techniques and deep learning models built with PyTorch to classify flags accurately.
cross-validation cuda data-augmentation docker efficientnetb0 flag-recognition image-classification machine-learning mixed-precision-training mobilenetv2 python pytorch resnet resnet-50 transfer-learning
Last synced: 31 Jan 2025
https://github.com/sangioai/torchpace
PyTorch CUDA/C++ extension of PACE: Transformer non-linearity accelerator engine.
Last synced: 02 Feb 2025
https://github.com/rajshrestha86/kmeans-clusterize-cuda
Implementation of K-Means algorithm from scratch using CUDA.
Last synced: 07 Feb 2025
https://github.com/bjornmelin/edge-ai-engineering
📱 Optimized ML for edge devices. Showcasing efficient model deployment, GPU-CPU memory transfer optimization, and real-world edge AI applications. 🤖
cuda edge-computing embedded-systems gpu-optimization iot mobile-ml model-optimization python tflite
Last synced: 02 Feb 2025
https://github.com/ne0nwinds/gpupuzzles
My solutions to srush/GPU-Puzzles using CUDA
Last synced: 02 Feb 2025
https://github.com/ysl1016/cudadigitfilter
CUDA-based parallel image filtering system for MNIST dataset
computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing
Last synced: 02 Feb 2025
https://github.com/bjornmelin/ai-system-design
🎨 Large-scale AI system architectures and implementations. Features distributed training systems, multi-GPU pipelines, and efficient resource management. 🏗️
architecture cuda distributed-systems engineering gpu-computing production scalability system-design
Last synced: 02 Feb 2025
https://github.com/rzxmha/linear_algebra
Linear Algebra project from TripleTen
blas computational-science cuda data-science data-visualization eigenvectors gram-schmidt linear-transformations matrix-calculations numpy nvidia python symmetric-matrices typescript
Last synced: 02 Feb 2025
https://github.com/jiriklepl/bits-knn-jpdc2024
Replication package for the paper Towards Optimal GPU-accelerated K-Nearest Neighbors Search
bitonic-sort cuda gpu k-nearest-neighbors knn-search top-k
Last synced: 26 Jan 2025
https://github.com/wiktor2718/matrix_flow
Matrix Flow is a simple machine learning library written in Rust and CUDA. It was created as a portfolio project to deepen my understanding of machine learning, GPU programming, and Rust. It provides an API for matrix manipulation and includes specially optimized neural networks.
adam-optimizer benchmarking cuda deep-learning gpu-computing machine-learning matrix-operations neural-networks portfolio-project rust
Last synced: 26 Jan 2025
https://github.com/jxtngx/cuda-lab
simple CUDA kernels and Python bindings
artificial-intelligence cpp cuda deep-learning machine-learning neural-networks python
Last synced: 26 Jan 2025
https://github.com/lruizap/testcuda
Guide to installing and using CUDA for programming
Last synced: 02 Feb 2025
https://github.com/marcorentap/kokkos-docker-cluster
Deploy Docker containers with Kokkos, OpenMP, OpenMPI and CUDA as a Docker swarm.
Last synced: 23 Oct 2024
https://github.com/amruthapatil/nyu-cudamatrixoperations
Optimizing CUDA programs for vector addition and matrix multiplication
cuda high-performance-computing
Last synced: 20 Jan 2025
https://github.com/djenriquez/ccminer
Dockerized ccminer
cuda docker ethereum mining nvidia nvidia-docker
Last synced: 01 Feb 2025
https://github.com/amruthapatil/nyu-cudaconvolution
Implementing convolution operations on an image using CUDA, exploiting different methodologies - basic, tiled, and cuDNN
Last synced: 20 Jan 2025
https://github.com/trentonom0r3/raft-analysis
Simple analysis script 'demotest.py' using RAFT optical flow to get flow vectors, occlusion masks, and information on keyframes with significant motion changes
cuda flow-maps occlusion-masks opticalflow python pytorch raft
Last synced: 08 Feb 2025
https://github.com/cppshizoids/cuda
These are my basic CUDA lessons
cuda cuda-demo cuda-programming
Last synced: 14 Feb 2025
https://github.com/popke523/rybki
A 3D shoal of fish animation using the boids algorithm, OpenGL for rendering and CUDA for parallel processing.
Last synced: 08 Feb 2025
https://github.com/voschezang/holographic-projector-simulations
Optimizations of Simulations of Holographic Projectors using CUDA
cuda gpu holography parallel-computing photonics
Last synced: 05 Jan 2025
https://github.com/camille-004/cusprec
🏁 Sparse signal recovery library written in PyCUDA.
cuda ml python signal-processing sparse-recovery
Last synced: 19 Dec 2024
https://github.com/phantom7knight/cuda-fusion
This project is for learning CUDA to understand the GPU work better.
cuda cuda-programming gpgpu gpu
Last synced: 08 Feb 2025