Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/llm-db/understanding-gpu-architecture-implications-on-llm-serving-workloads

Understanding GPU Architecture Implications on LLM Serving Workloads (Master's Thesis, ETH Zürich, 2024)

cuda inference pytorch rocm transformer

Last synced: 14 Dec 2024

https://github.com/cerit-sc/scipion-docker

Scipion, a cryo-EM image processing framework (https://scipion.i2pc.es/), adapted to run in Kubernetes.

cryo-em cryoem cuda desktop kubernetes scipion vnc

Last synced: 06 Dec 2024

https://github.com/bhavinpatel4199/image-processing-with-opencv-and-cuda-on-google-colab

This repository demonstrates image processing using OpenCV with CUDA for GPU acceleration on Google Colab. It includes basics like displaying and manipulating images, alongside advanced techniques using CUDA to enhance performance. Ideal for learning GPU-accelerated image processing in Python.

computer-vision cuda google-colab gpu-acceleration high-performance-computing image-processing opencv pixel-manupulation

Last synced: 12 Feb 2025

https://github.com/hailiang-wang/cuda-get-started

Get started with CUDA

cuda machine-learning nvidia

Last synced: 07 Jan 2025

https://github.com/kenwuqianghao/c4ai-cuda-birds

Homework assignments for C4AI Beginners in Research-Driven Studies

cuda machine-learning pytorch

Last synced: 27 Dec 2024

https://github.com/wpjunior/cuda-numba-playground

Some uses of CUDA with the Numba framework

cuda numba python

Last synced: 13 Jan 2025

https://github.com/kentakoong/mtnlog

A simple multinode performance logger for Python

cuda lanta nvitop python slurm-cluster

Last synced: 22 Jan 2025

https://github.com/cscfi/csc-env-julia

Julia language environment including MPI.jl, CUDA.jl and AMDGPU.jl preferences for HPC clusters at CSC.

amdgpu ansible cuda hpc julia julia-language mpi

Last synced: 22 Jan 2025

https://github.com/bardiparsi/threadpoolmanager

ThreadPoolManager is a C++ project that implements an efficient multi-threading system using a thread pool for generic functions of the same type and different tasks. It includes task management, synchronization mechanisms, and thread-safe logging to demonstrate concurrent task execution.

cpp cpp17 cpp20 cuda cuda-programming memory-management multiprocessing multithreading parallel-computing parallel-processing parallel-programming thread thread-pool thread-safety threadpool threads threadsafe

Last synced: 19 Nov 2024

https://github.com/vladd12/libexecstd

Modern C++ library for using an execution context of computer devices

cpp cpp17 cuda gpu-acceleration gpu-computing

Last synced: 28 Jan 2025

https://github.com/ydkn/htw-progko-cuda

Parallel processing of image transformations. Part of the "Programmierkonzepte und Algorithmen" course at HTW-Berlin.

cuda image-transformations opencv

Last synced: 11 Jan 2025

https://github.com/zalo/matmul_cuda

A simple learning example for CUDA

cuda

Last synced: 14 Jan 2025

https://github.com/usman619/pdc

Parallel and Distributed Computing

cuda distributed-computing distributed-systems nextcloud

Last synced: 13 Jan 2025

https://github.com/juntyr/necsim-rust-analysis

Analysis of the spatially explicit biodiversity simulation `necsim-rust`

analysis biodiversity cuda mpi necsim rust simulation

Last synced: 25 Jan 2025

https://github.com/neugence/acehub

AI Champions for Excellence: Fresh, informative courses and content designed to help developers, researchers, and leaders advance in the field of AI.

ai cuda cv ml mlops nlp pytorch rl rlhf tensorflow

Last synced: 13 Oct 2024

https://github.com/sangioai/sph

CUDA and OpenMP versions of the serial SPH (Smoothed Particle Hydrodynamics) algorithm.

cuda openmp

Last synced: 12 Feb 2025

https://github.com/sahil-rajwar-2004/vector-cuda

Vector calculation with GPU acceleration using CUDA

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 19 Nov 2024

https://github.com/zjeffer/docker-arch-cuda

Arch Linux base image with the latest CUDA, cuDNN and LibTorch preinstalled.

archlinux cuda docker libtorch pytorch

Last synced: 20 Dec 2024

https://github.com/dirmeier/cuda-etudes

🎶 A collection of CUDA recipes

cpp cuda meson

Last synced: 17 Jan 2025

https://github.com/mathiasotnes/gemm

General Matrix Multiplication (GEMM) optimization in CUDA.

cuda gpu

Last synced: 31 Jan 2025

https://github.com/parxd/fasterdl

cuBLAS/CUDA tensor library with auto-diff support

cublas cuda cudnn deep-learning machine-learning

Last synced: 06 Jan 2025

https://github.com/lordofhyphens/gpu-path-delay-coverage

CUDA-based Path Delay Fault Coverage

cpp cuda gpgpu moderngpu

Last synced: 28 Jan 2025

https://github.com/rmeli/cuda-pg

CUDA C++ Playground

cpp cuda gpu

Last synced: 01 Feb 2025

https://github.com/fabulani/360ip-with-cuda

360° Image Processing with CUDA and OpenCV.

360-image 360-video cpp cuda image-processing opencv

Last synced: 08 Feb 2025

https://github.com/edisonslightbulbs/viewer

Exploring real-time 3D point cloud rendering using CUDA and OpenGL

cuda cxx11 opengl pangolin submodule

Last synced: 14 Jan 2025

https://github.com/lfrati/subpair

Fast pairwise cosine distance calculation and numba accelerated evolutionary matrix subset extraction 🍐🚀

cosine-distance cuda numba

Last synced: 16 Jan 2025

https://github.com/jpuigcerver/prob-phoc

Probabilistic relevance scores from PHOC embeddings

cuda keyword-spotting kws phoc pytorch

Last synced: 16 Jan 2025

https://github.com/m-torhan/advent-of-code

🎄 Solutions for the Advent of Code

advent-of-code advent-of-code-2024 cuda

Last synced: 20 Dec 2024

https://github.com/chrisdalvit/gpu-matrix-transpose

Implementation and benchmarking of different matrix transpose approaches with CUDA

c cpp cuda cuda-kernels cuda-programming gpu-acceleration gpu-computing gpu-programming matrix-transpose nvidia-gpu

Last synced: 20 Dec 2024

https://github.com/darshanakgr/meanfiltergpu

A GPU implementation of a mean filter in CUDA

c cuda image-processing

Last synced: 28 Jan 2025

https://github.com/rog0d/gpuss_watchers

"The GPU Watchers swore upon their shared memory hierarchy, from L1 to global memory, which also served as their mandate as lords of parallel computation."

cuda gpu-acceleration gpu-monitoring gpu-profiling

Last synced: 20 Dec 2024

https://github.com/programmergnome/cuda-codes

Snippet repository for learning parallel GPU programming with CUDA.

c cpp-programming cuda cuda-kernel gpu-programming learning-materials parallel-programming parallelization

Last synced: 22 Jan 2025

https://github.com/snandasena/courseera_gpu_specilization

Example of CUDA streaming

c cpp cuda

Last synced: 14 Jan 2025

https://github.com/flavienbwk/tensorflow2-cuda-10.2-docker

TensorFlow 2.3, CUDA 10.2, Docker-compatible image

cuda docker python3 tensorflow ubuntu1804

Last synced: 28 Jan 2025

https://github.com/boostibot/bachelors

My bachelor's thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering, supervised by Ing. Pavel Strachota, Ph.D.

crystal-growth cuda finite-volume-method parallel-programming phase-field-method

Last synced: 18 Jan 2025

https://github.com/sustia-llc/gpu_logger_poc

GPU execution verification system with immutable Kafka logging. Monitors CUDA operations, validates GPU performance, and maintains auditable operation history. Built with Rust and Candle for reliable ML model execution tracking.

candle-core cuda docker gpu gpu-computing kafka logging machine-learning mlops monitoring nvidia performance-testing rust

Last synced: 12 Feb 2025

https://github.com/grindelfp/cuda-n-body-simulation

Simulation of N-Body movement using CUDA.

cuda n-body-simulation

Last synced: 12 Feb 2025

https://github.com/jonyandunh/stanforddogsresnet

A classifier for the 120 dog breeds in the Stanford Dogs Dataset, built with the PyTorch framework and a custom ResNet.

cuda deep-learning python pytorch resnet resnet-18 standford-dog stanford

Last synced: 14 Jan 2025

https://github.com/sydney-informatics-hub/computer-vision-fine-tuning

Fine-tune a computer vision model to solve your task locally, on HPC, in a container, or in the cloud!

computer-vision cuda deep-learning python

Last synced: 22 Jan 2025

https://github.com/kanchishimono/python-images

Ubuntu based Python container images, including CUDA images

container-image cuda docker dockerfile machine-learning python python3

Last synced: 26 Jan 2025

https://github.com/akhuntsaria/image-filters

Image filters implemented in CUDA C/C++

cuda image-processing

Last synced: 07 Jan 2025

https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is

Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization

ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow

Last synced: 12 Oct 2024

https://github.com/dragonscypher/prompty

Tool for generating smart and secure prompts for language models!

autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading

Last synced: 22 Jan 2025

https://github.com/raiszo/cs334

Journey through Intro to Parallel Programming

cmake cs334 cuda msbuild

Last synced: 25 Jan 2025

https://github.com/sarah627/horus_eye_fcih_graduation_project

An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.

computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow

Last synced: 21 Jan 2025

https://github.com/awikramanayake/optimized-matrix-mult

Optimizing matrix multiplication using parallelism and SIMD (AVX2, CUDA)

avx2 cuda matrix-multiplication

Last synced: 21 Jan 2025

https://github.com/bd2720/accesspatterns

Comparing chunked vs. striped memory access patterns for CPU and GPU code using the CUDA toolkit in C.

c cache cuda cuda-toolkit performance-analysis performance-testing profiling

Last synced: 31 Jan 2025

https://github.com/branebb/nn-framework

Framework for creating neural networks using C++ and the CUDA platform. This project is part of my final university assignment for my bachelor's degree.

cmake cpp cuda cuda-programming

Last synced: 19 Nov 2024

https://github.com/parlaynu/inference-tvm

Export ONNX to Apache TVM and run inference in containerized environments.

apache-tvm cuda docker jetson-nano onnx raspberrypi4 x86-64

Last synced: 28 Jan 2025

https://github.com/fikri-rouzan/cuda-c-program-part-3

CUDA C program from NVIDIA course.

c cuda

Last synced: 05 Feb 2025

https://github.com/fikri-rouzan/cuda-c-program-part-1

CUDA C program from NVIDIA course.

c cuda

Last synced: 05 Feb 2025

https://github.com/fikri-rouzan/cuda-c-program-part-2

CUDA C program from NVIDIA course.

c cuda

Last synced: 05 Feb 2025

https://github.com/thomasvonwu/interview-note

Share Interview Questions and Summarize Answers

cuda interview llm

Last synced: 05 Feb 2025

https://github.com/kts-o7/n-body-parallel-implementation

A simple study comparing the speed-up obtained with different parallelization approaches (MPI, OpenMP and CUDA) for an FFT implementation of n-body simulation

cuda mpi openmp parallel-computing pthreads

Last synced: 05 Feb 2025

https://github.com/f14-bertolotti/torchess

CUDA torch extension for a chess engine

chess cuda torch

Last synced: 05 Feb 2025

https://github.com/pintamonas4575/rlgan-project-maadm-upm

Neuroevolution to learn the Lunar Lander from Gymnasium and a GAN to learn to color images. Coursework for the ML and BD master's degree at UPM.

cuda deep-learning gan genetic-algorithm lunar-lander machine-learning mlp python3 pytorch reinforcement-learning tensorflow

Last synced: 05 Feb 2025

https://github.com/rushirg/cuda-matrix-multiplication

Matrix Multiplication on GPGPU in CUDA

cpu cuda gpu parallel-processing

Last synced: 21 Jan 2025

https://github.com/ivanbgd/cuda_quad_c

Calculates a definite integral by using three different rules. Compares sequential to parallel implementations.

cuda integrals parallel-implementations

Last synced: 03 Feb 2025

https://github.com/daskol/gpgpu

cuda gpgpu

Last synced: 12 Jan 2025

https://github.com/karusb/2dca-cuda

2 Dimensional Cellular Automata Visualisation (Game of Life)

algorithm-flowchart cellular-automata cuda game game-of-life glut visual-studio

Last synced: 08 Jan 2025

https://github.com/ghusta/jcuda-demo

JCUDA demo

cuda java nvidia

Last synced: 06 Jan 2025

https://github.com/rurumimic/candle

Hugging Face Candle

cuda gpu huggingface nvidia transformer

Last synced: 27 Jan 2025

https://github.com/flolu/hardware-praktikum

Summer semester 2021 Hardware Praktikum (hardware lab course)

college cuda hardware

Last synced: 09 Jan 2025

https://github.com/emilienmendes/gpgpu

Parallelization and optimization of point recognition in an image

cuda gpgpu parallel-programming

Last synced: 27 Jan 2025

https://github.com/strigidie/cudar

The custom graphics pipeline based on NVIDIA CUDA ⚙️

cuda graphics-pipeline

Last synced: 27 Jan 2025

https://github.com/ribin-baby/cuda_cudnn_installation_on_ubuntu20.04

Installation of CUDA 11.8 with cuDNN 8.7 on an Ubuntu 20.04 server with an A30 GPU, plus an ONNX GPU installation guide

cuda gpu linux onnxruntime server

Last synced: 16 Jan 2025

https://github.com/gladap/heterogeneous_computing_project

Heterogeneous parallel programming exercise using OpenMP and CUDA to parallelize image filters

cuda heterogeneous-parallel-programming

Last synced: 05 Feb 2025

https://github.com/sferez/sspp_sparse_matrix_cuda

Small Scale Parallel Programming, Sparse Matrix multiplication with CUDA

cpp cuda omp omp-parallel parallel-computing small-scale-parallel-programming sparse-matrix

Last synced: 13 Jan 2025

https://github.com/isquicha/cuda-parallel-studies

Learning CUDA programming here =D

cuda cuda-programming cuda-toolkit

Last synced: 22 Jan 2025

https://github.com/toshikinakamura0412/dotfiles_for_docker

My dotfiles for Docker containers of some Linux distributions

cuda docker docker-compose dotfiles git neovim ros-noetic tmux zsh

Last synced: 20 Nov 2024

https://github.com/versi379/optimized-matrix-multiplication

This project utilizes CUDA and cuBLAS to optimize matrix multiplication, achieving up to a 5x speedup on large matrices by leveraging GPU acceleration. It also improves memory efficiency and reduces data transfer times between CPU and GPU.

cublas cuda cuda-programming hpc matrix-multiplication parallel-computing parallel-programming

Last synced: 21 Jan 2025

https://github.com/neel-dandiwala/cuda-programs

Miscellaneous programs that illustrate the concepts of parallel computing

cuda gpu-programming parallel-programming

Last synced: 26 Dec 2024

https://github.com/xza85hrf/flux_pipeline

FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.

ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model

Last synced: 22 Dec 2024

https://github.com/juntyr/necsim-rust-docs

Documentation of the spatially explicit biodiversity simulation necsim-rust

biodiversity cuda docs mpi necsim rust simulation

Last synced: 03 Feb 2025

https://github.com/cs550-epfl/review

Review of the paper "A Formal Analysis of the NVIDIA PTX Memory Consistency Model"

cuda formal-verification gpu memory-consistency ptx simt

Last synced: 05 Feb 2025

https://github.com/skyguy126/cuda-learnings

Collection of personal CUDA learnings.

cuda

Last synced: 05 Feb 2025

https://github.com/jamezchard/s1mple_c0mpute

Some compute (GPGPU) code

c cpp cuda gpgpu

Last synced: 05 Feb 2025

https://github.com/roryclear/warp-shuffle-demo

warp reduce example

cuda warp

Last synced: 05 Feb 2025

https://github.com/spatialgraphics/tardis

Travel space and time by using autodiff and codegen

autodiff codegen cuda

Last synced: 05 Feb 2025

https://github.com/sbstndb/neural_k

A simple neural network library using Kokkos, with a CUDA or OpenMP backend

ai cuda kokkos library neural-network openmp

Last synced: 05 Feb 2025

https://github.com/phrutis/brainwords2

GPU brainflayer for sale $250

brain brainflayer brainwords cuda gpu key pass passphrase private

Last synced: 05 Feb 2025

https://github.com/alexkranias/triton_vs_cuda

Building Triton and CUDA kernels side-by-side to create a cuBLAS-performant GEMM kernel.

cuda cuda-kernels gpu gpu-programming parallel-programming python triton

Last synced: 05 Feb 2025