Projects in Awesome Lists tagged with gpu-acceleration
A curated list of projects in awesome lists tagged with gpu-acceleration.
https://github.com/tensorflow/tfjs
A WebGL-accelerated JavaScript library for training and deploying ML models.
deep-learning deep-neural-network gpu-acceleration javascript machine-learning neural-network typescript wasm web-assembly webgl
Last synced: 09 Sep 2025
https://github.com/nvidia/tensorrt
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
deep-learning gpu-acceleration inference nvidia tensorrt
Last synced: 09 Sep 2025
https://github.com/NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
deep-learning gpu-acceleration inference nvidia tensorrt
Last synced: 20 Mar 2025
https://github.com/tensorflow/tfjs-core
WebGL-accelerated ML // linear algebra // automatic differentiation for JavaScript.
deep-learning deep-neural-networks gpu-acceleration javascript machine-learning neural-network typescript webgl
Last synced: 30 Sep 2025
https://github.com/raphamorim/rio
A hardware-accelerated GPU terminal emulator designed to run on desktops and in browsers.
gpu-acceleration rio rio-terminal rust rust-lang terminal terminal-emulator terminal-emulators terminal-ui vte wgpu
Last synced: 15 Dec 2025
https://github.com/cornellius-gp/gpytorch
A highly efficient implementation of Gaussian Processes in PyTorch
gaussian-processes gpu-acceleration pytorch
Last synced: 13 May 2025
https://github.com/nvidia/generativeaiexamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
gpu-acceleration large-language-models llm llm-inference microservice nemo rag retrieval-augmented-generation tensorrt triton-inference-server
Last synced: 13 May 2025
https://github.com/NVIDIA/GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
gpu-acceleration large-language-models llm llm-inference microservice nemo rag retrieval-augmented-generation tensorrt triton-inference-server
Last synced: 28 Mar 2025
https://github.com/Hedgehog-Computing/hedgehog-lab
Run, compile and execute JavaScript for Scientific Computing and Data Visualization TOTALLY TOTALLY TOTALLY in your BROWSER! An open source scientific computing environment for JavaScript TOTALLY in your browser, matrix operations with GPU acceleration, TeX support, data visualization and symbolic computation.
computer-algebra data-visualization gpu-acceleration javascript latex machine-learning matrix-library scientific-computing symbolic-computation tex webgl webgl2
Last synced: 30 Mar 2025
https://github.com/hedgehog-computing/hedgehog-lab
Run, compile and execute JavaScript for Scientific Computing and Data Visualization TOTALLY TOTALLY TOTALLY in your BROWSER! An open source scientific computing environment for JavaScript TOTALLY in your browser, matrix operations with GPU acceleration, TeX support, data visualization and symbolic computation.
computer-algebra data-visualization gpu-acceleration javascript latex machine-learning matrix-library scientific-computing symbolic-computation tex webgl webgl2
Last synced: 15 May 2025
https://github.com/BlazingDB/blazingsql
BlazingSQL is a lightweight, GPU-accelerated SQL engine for Python. Built on RAPIDS cuDF.
arrow artificial-intelligence blazingsql conda-environment cudf data-science gpu gpu-acceleration gpu-dataframes machine-learning machine-learning-workflow python rapids rapidsai sql sql-engine
Last synced: 25 Mar 2025
https://github.com/blazingdb/blazingsql
BlazingSQL is a lightweight, GPU-accelerated SQL engine for Python. Built on RAPIDS cuDF.
arrow artificial-intelligence blazingsql conda-environment cudf data-science gpu gpu-acceleration gpu-dataframes machine-learning machine-learning-workflow python rapids rapidsai sql sql-engine
Last synced: 15 May 2025
https://github.com/tianzerl/anime4kcpp
A high performance anime upscaler
anime anime4k anime4kcpp avisynth avisynthplus-plugin cnn computer-graphics cpp directshow-filter gpu-acceleration machine-learning upscaling vapoursynth vapoursynth-plugin video-processing
Last synced: 14 May 2025
https://github.com/TianZerL/Anime4KCPP
A high performance anime upscaler
anime anime4k anime4kcpp avisynth avisynthplus-plugin cnn computer-graphics cpp directshow-filter gpu-acceleration machine-learning upscaling vapoursynth vapoursynth-plugin video-processing
Last synced: 06 May 2025
https://github.com/coreylowman/dfdx
Deep learning in Rust, with shape checked tensors and neural networks
autodiff autodifferentiation autograd backpropagation cuda cuda-kernels cuda-support cuda-toolkit cudnn deep-learning deep-neural-networks gpu gpu-acceleration gpu-computing machine-learning neural-network rust rust-lang tensor
Last synced: 14 May 2025
https://github.com/emacs-ng/emacs-ng
A new approach to Emacs - Including TypeScript, Threading, Async I/O, and WebRender.
async deno emacs emacs-ng gpu gpu-acceleration javascript rust wasm webassembly webrender webworkers
Last synced: 14 May 2025
https://github.com/nvidia/cccl
CUDA Core Compute Libraries
accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming
Last synced: 13 May 2025
https://github.com/calebwin/emu
The write-once-run-anywhere GPGPU library for Rust
emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust
Last synced: 14 May 2025
https://calebwin.github.io/emu/
The write-once-run-anywhere GPGPU library for Rust
emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust
Last synced: 30 Apr 2025
https://github.com/emi-group/evox
Distributed GPU-Accelerated Framework for Evolutionary Computation. Comprehensive Library of Evolutionary Algorithms & Benchmark Problems.
black-box-optimization brax derivative-free-optimization evolutionary-algorithms evolutionary-computation evolutionary-optimization evolutionary-reinforcement-learinig evolutionary-strategies gpu-acceleration gradient-free-optimization gym jax metaheuristics multi-objective-optimization neuroevolution population-based-optimization pytorch ray
Last synced: 06 Nov 2025
https://github.com/beehive-lab/tornadovm
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
ai cuda gpu-acceleration gpu-computing gpus graalvm java levelzero multi-core opencl parallel-computing parallel-programming spirv
Last synced: 02 Dec 2025
https://github.com/NVIDIA/cccl
CUDA Core Compute Libraries
accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming
Last synced: 14 May 2025
https://github.com/beehive-lab/TornadoVM
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
ai cuda gpu-acceleration gpu-computing gpus graalvm java levelzero multi-core opencl parallel-computing parallel-programming spirv
Last synced: 04 Apr 2025
https://github.com/stotko/stdgpu
stdgpu: Efficient STL-like Data Structures on the GPU
cpp cpp17 cpp20 cuda data-structures gpgpu gpu gpu-acceleration gpu-computing hip modern-cpp openmp rocm stl stl-containers stl-like
Last synced: 14 May 2025
https://github.com/jaysmito101/terraforge3d
Cross Platform Professional Procedural Terrain Generation & Texturing Tool
3d cpp game-development gamedev gpu-acceleration hacktoberfest imgui nodeeditor open-source opengl opensource precedural-textures procedural-generation terrain-generation
Last synced: 16 May 2025
https://jaysmito101.github.io/TerraForge3D/
Cross Platform Professional Procedural Terrain Generation & Texturing Tool
3d cpp game-development gamedev gpu-acceleration hacktoberfest imgui nodeeditor open-source opengl opensource precedural-textures procedural-generation terrain-generation
Last synced: 07 May 2025
https://github.com/liu-xiandong/how_to_optimize_in_gpu
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
elementwise gpu-acceleration high-performance-computing hpc reduce sgemm sgemv
Last synced: 03 Oct 2025
https://github.com/NVIDIA-Merlin/HugeCTR
HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models.
cpp deep-learning gpu-acceleration recommendation-system recommender-system
Last synced: 20 Jul 2025
https://github.com/nvidia-merlin/hugectr
HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models.
cpp deep-learning gpu-acceleration recommendation-system recommender-system
Last synced: 14 May 2025
https://github.com/chelsea0x3b/cudarc
Safe Rust wrapper around the CUDA toolkit
cublas cuda cuda-kernels cuda-programming cuda-toolkit cudnn curand gpu gpu-acceleration nccl nvrtc rust
Last synced: 03 Jan 2026
https://github.com/hughperkins/VeriGPU
Open-source GPU, in Verilog, loosely based on the RISC-V ISA
asic-design gpu gpu-acceleration hardware-designs machine-learning risc-v risc-v-assembly verification verilog
Last synced: 02 Apr 2025
https://github.com/Jaysmito101/TerraForge3D
Cross Platform Professional Procedural Terrain Generation & Texturing Tool
3d cpp game-development gamedev gpu-acceleration hacktoberfest imgui nodeeditor open-source opengl opensource precedural-textures procedural-generation terrain-generation
Last synced: 01 Apr 2025
https://github.com/dgasmith/opt_einsum
⚡️ Optimizing einsum functions in NumPy, TensorFlow, Dask, and more with contraction order optimization.
contraction einsum gpu-acceleration performance python tensor tensor-contraction
Last synced: 13 May 2025
https://github.com/hughperkins/verigpu
Open-source GPU, in Verilog, loosely based on the RISC-V ISA
asic-design gpu gpu-acceleration hardware-designs machine-learning risc-v risc-v-assembly verification verilog
Last synced: 18 Oct 2025
https://github.com/coreylowman/cudarc
Safe Rust wrapper around the CUDA toolkit
cublas cuda cuda-kernels cuda-programming cuda-toolkit cudnn curand gpu gpu-acceleration nccl nvrtc rust
Last synced: 14 May 2025
https://github.com/eszdman/PhotonCamera
Android camera app that uses enhanced image processing
android android-camera camera2-api computational-photography computer-vision gpu-acceleration image-processing photography
Last synced: 27 Mar 2025
https://github.com/Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is basically at or near the theoretical limit.
elementwise gpu-acceleration high-performance-computing hpc reduce sgemm sgemv
Last synced: 14 May 2025
https://github.com/nvidia-merlin/merlin
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
deep-learning end-to-end gpu-acceleration machine-learning recommendation-system recommender-system
Last synced: 13 Apr 2025
https://github.com/NVIDIA-Merlin/Merlin
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
deep-learning end-to-end gpu-acceleration machine-learning recommendation-system recommender-system
Last synced: 30 Jul 2025
https://github.com/iot-salzburg/gpu-jupyter
GPU-Jupyter: Your GPU-accelerated JupyterLab with a rich data science toolstack, TensorFlow and PyTorch for your reproducible deep learning experiments.
docker environment gpu-acceleration gpu-computing jupyter jupyter-server jupyterlab pytorch reproducible-research tensorflow
Last synced: 15 May 2025
https://github.com/ttddee/Cascade
Node-based image editor with GPU acceleration.
gpu-acceleration image-editor node-based vulkan
Last synced: 01 Apr 2025
https://github.com/limbo018/DREAMPlace
Deep learning toolkit-enabled VLSI placement
deep-learning gpu-acceleration pytorch vlsi vlsi-physical-design vlsi-placement
Last synced: 08 May 2025
https://github.com/sergio0694/neuralnetwork.net
A TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
ai backpropagation-algorithm classification-algorithims cnn convolutional-neural-networks csharp cuda gpu-acceleration gradient-descent machine-learning net-framework netstandard neural-network supervised-learning visual-studio
Last synced: 16 May 2025
https://github.com/Sergio0694/NeuralNetwork.NET
A TensorFlow-inspired neural network library built from scratch in C# 7.3 for .NET Standard 2.0, with GPU support through cuDNN
ai backpropagation-algorithm classification-algorithims cnn convolutional-neural-networks csharp cuda gpu-acceleration gradient-descent machine-learning net-framework netstandard neural-network supervised-learning visual-studio
Last synced: 02 Apr 2025
https://github.com/philferriere/dlwin
GPU-accelerated Deep Learning on Windows 10 native
cntk cudnn deep-learning gpu-acceleration gpu-mode keras tensorflow theano
Last synced: 05 Apr 2025
https://github.com/DavidDiazGuerra/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
acoustics gpu-acceleration image-source-model python-library rir room-impulse-responses
Last synced: 01 Apr 2025
https://github.com/megviirobot/megba
MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
bundleadjustment cuda distributed gpu-acceleration graph-optimization high-performance
Last synced: 24 Jun 2025
https://github.com/MegviiRobot/MegBA
MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment
bundleadjustment cuda distributed gpu-acceleration graph-optimization high-performance
Last synced: 07 May 2025
https://github.com/projectphysx/opencl-wrapper
OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.
gpgpu gpgpu-computing gpu gpu-acceleration gpu-computing gpu-programming opencl vector-processor vectorization
Last synced: 16 May 2025
https://github.com/uncomplicate/bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
bayesian bayesian-data-analysis bayesian-inference clojure clojure-library cuda gpu gpu-acceleration gpu-computing high-performance-computing machine-learning markov-chain-monte-carlo mcmc opencl statistics
Last synced: 09 Apr 2025
https://github.com/datacanvasio/hypergbm
A full pipeline AutoML tool for tabular data
adversarial-validation automl catboost dask dask-distributed datacleaning distributed-training ensemble-learning fullpipeline gbm gpu-acceleration lightgbm preprocessing pseudo-labeling rapidsai semi-supervised-learning sklearn tabular-data xgboost
Last synced: 15 May 2025
https://github.com/DataCanvasIO/HyperGBM
A full pipeline AutoML tool for tabular data
adversarial-validation automl catboost dask dask-distributed datacleaning distributed-training ensemble-learning fullpipeline gbm gpu-acceleration lightgbm preprocessing pseudo-labeling rapidsai semi-supervised-learning sklearn tabular-data xgboost
Last synced: 09 May 2025
https://github.com/andrewmilson/ministark
🏃‍♂️💨 GPU-accelerated STARK prover built on @arkworks-rs
apple-silicon arkworks arkworks-rs crypto cryptography fft finite-fields gpu gpu-acceleration gpu-computing gpu-programming m1 metal optimization polynomials rust starks virtual-machine zero-knowledge zkstarks
Last synced: 12 Dec 2025
https://github.com/favreau/Sol-R
Open-Source CUDA/OpenCL Speed Of Light Ray-tracer
3d 3d-graphics-engine cuda gpgpu gpu-acceleration gpu-computing graphics-engine interactive opencl path-tracing pathtracing ray-tracing raytracer raytracing raytracing-engine realtime-rendering rendering science virtual-reality vr
Last synced: 30 Apr 2025
https://github.com/quiver-team/torch-quiver
PyTorch Library for Low-Latency, High-Throughput Graph Learning on GPUs.
distributed-computing geometric-deep-learning gpu-acceleration graph-learning graph-neural-networks pytorch
Last synced: 04 Apr 2025
https://github.com/baggepinnen/montecarlomeasurements.jl
Propagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.
error-analysis error-propagation gpu-acceleration gpu-computing monte-carlo monte-carlo-sampling monte-carlo-simulation numeric-types particle-filter physical-quantities probability-distributions robust-optimization uncertainties uncertainty-propagation uncertainty-sampling
Last synced: 15 Mar 2025
https://github.com/baggepinnen/MonteCarloMeasurements.jl
Propagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.
error-analysis error-propagation gpu-acceleration gpu-computing monte-carlo monte-carlo-sampling monte-carlo-simulation numeric-types particle-filter physical-quantities probability-distributions robust-optimization uncertainties uncertainty-propagation uncertainty-sampling
Last synced: 27 Mar 2025
https://github.com/nvidia/cuopt-resources
A collection of NVIDIA cuOpt samples and other resources
cvrp cvrptw gpu gpu-acceleration gpu-optimization intralogistics last-mile-delivery logistics nvidia-gpu operations-research optimization optimization-algorithms optimization-tools pickup-and-delivery route-optimization traveling-salesman-problem tsp-solver vehicle-routing-problem vrp vrp-solver
Last synced: 04 Apr 2025
https://github.com/marian-nmt/marian-dev
Fast Neural Machine Translation in C++ - development repository
cpp11 cuda fast gpu-acceleration neural-machine-translation
Last synced: 15 May 2025
https://github.com/AdrianAntico/AutoQuant
R package for automation of machine learning, forecasting, model evaluation, and model interpretation
automated-machine-learning automl catboost classification gpu-acceleration h2o lightgbm multiclass-classification panel-data r regression supervised-learning timeseries unsupervised-learning xgboost
Last synced: 26 Apr 2025
https://github.com/AdrianAntico/RemixAutoML
R package for automation of machine learning, forecasting, model evaluation, and model interpretation
automated-machine-learning automl catboost classification gpu-acceleration h2o lightgbm multiclass-classification panel-data r regression supervised-learning timeseries unsupervised-learning xgboost
Last synced: 14 Mar 2025
https://github.com/clesperanto/pyclesperanto_prototype
GPU-accelerated bio-image analysis focusing on 3D+t microscopy image data
bioimage-analysis gpu-acceleration microscopy
Last synced: 21 Oct 2025
https://github.com/denosaurs/netsaur
Powerful Machine Learning library with GPU, CPU and WASM backends
ai artificial-intelligence deep-learning deep-neural-networks deno edge-computing gpu-acceleration gpu-computing hacktoberfest machine-learning ml neural-network rust safetensors serverless typescript wasm webassembly webgpu
Last synced: 04 Apr 2025
https://github.com/audiokit/waveform
GPU-accelerated waveform view
audio audio-visualizer gpu-acceleration metal
Last synced: 17 Jun 2025
https://github.com/bh107/bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
cuda gpu gpu-acceleration multi-core numpy opencl parallel-computing
Last synced: 21 Oct 2025
https://github.com/clEsperanto/pyclesperanto_prototype
GPU-accelerated bio-image analysis focusing on 3D+t microscopy image data
bioimage-analysis gpu-acceleration microscopy
Last synced: 21 Mar 2025
https://github.com/BasBuller/PySNN
Efficient Spiking Neural Network framework, built on top of PyTorch for GPU acceleration
deep-learning dynamic gpu-acceleration gpu-computing machine-learning neural-networks python3 pytorch spiking-neural-networks stdp
Last synced: 07 May 2025
https://github.com/AudioKit/Waveform
GPU-accelerated waveform view
audio audio-visualizer gpu-acceleration metal
Last synced: 16 Jul 2025
https://github.com/rocm/tensile
Stretching GPU performance for GEMMs and tensor contractions.
amd assembly auto-tuning blas dnn gemm gpu gpu-acceleration gpu-computing hip machine-learning matrix-multiplication neural-networks opencl python radeon tensor-contraction tensors
Last synced: 05 Apr 2025
https://github.com/ROCm/Tensile
Stretching GPU performance for GEMMs and tensor contractions.
amd assembly auto-tuning blas dnn gemm gpu gpu-acceleration gpu-computing hip machine-learning matrix-multiplication neural-networks opencl python radeon tensor-contraction tensors
Last synced: 23 Jul 2025
https://github.com/uncomplicate/clojurecuda
Clojure library for CUDA development
clojure clojure-library cuda cuda-development gpu-acceleration gpu-computing high-performance java
Last synced: 16 May 2025
https://github.com/peculiarventures/gammacv
GammaCV is a WebGL-accelerated computer vision library for the browser
computer-vision feature-extraction gpu gpu-acceleration image-analysis image-processing machine-learning machine-vision object-detection opencv webgl
Last synced: 08 Apr 2025
https://github.com/yzhao062/pytod
TOD: GPU-accelerated Outlier Detection via Tensor Operations
anomaly-detection gpu-acceleration gpu-systems machine-learning outlier-detection unsupervised-learning
Last synced: 23 Oct 2025
https://github.com/eth-cscs/cosma
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
communication-optimal cuda gpu-acceleration linear-algebra matmul matrix-multiplication mpi pdgemm rocm scalapack
Last synced: 04 Apr 2025
https://github.com/merzlab/QUICK
QUICK: A GPU-enabled ab initio quantum chemistry software package
chemistry computational-chemistry cuda density-functional-theory electronic-structure-calculations gpu gpu-acceleration hartree-fock parallel-computing quantum-chemistry
Last synced: 09 Jul 2025
https://github.com/ucl-bug/jwave
A JAX-based research framework for differentiable and parallelizable acoustic simulations, on CPUs, GPUs and TPUs
acoustics differentiable-simulations gpu gpu-acceleration jax kwave physics-informed-neural-networks scientific-machine-learning simulation tpu-acceleration ultrasound wave-equation
Last synced: 14 May 2025
https://github.com/juliahealth/komamri.jl
Koma is a Pulseq-compatible framework to efficiently simulate Magnetic Resonance Imaging (MRI) acquisitions. The main focus of this package is to simulate general scenarios that could arise in pulse sequence development.
cardiac diffusion diffusion-mri gpu-acceleration mri simulation
Last synced: 11 Dec 2025
https://github.com/ysh329/opencl-101
Learn OpenCL step by step.
gpu-acceleration gpu-programming guides opencl scratch tutorial-code tutorials
Last synced: 13 Apr 2025
https://github.com/mightycow/Sluggish
Toy CPU and GPU implementations of the Slug rendering algorithm
font gpu-acceleration rendering-algorithm slug
Last synced: 16 Oct 2025
https://github.com/arceryz/raylib-gpu-particles
Raylib 100% GPU particles example in 3D. Uses compute shaders and is fully documented. Millions of particles at 60 fps on a laptop.
c compute-shader example glsl gpu gpu-acceleration gui lorenz-attractor raygui raylib raylib-examples tutorial
Last synced: 11 Apr 2025
https://github.com/tianzerl/pyanime4k
An easy way to use anime4k in python
anime anime4k anime4kcpp gpu-acceleration machine-learning python python3 upscale upscaling video-processing
Last synced: 12 Apr 2025
https://github.com/icl-utk-edu/slate
SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) systems. It is developed as part of the U.S. Department of Energy Exascale Computing Project (ECP).
gpu-acceleration hpc linear-algebra
Last synced: 29 Dec 2025
https://github.com/mitmath/juliacomputation
Repository for Common Ground C25
automatic-differentiation climate-science gpu-acceleration image-processing julia julia-language machine-learning optimization pluto-notebooks
Last synced: 30 Oct 2025
https://github.com/Heteroflow/Heteroflow
Concurrent CPU-GPU Programming using Task Models
cpu-gpu-scheduling cuda gpu gpu-acceleration gpu-computing gpu-programming heterogeneous-computing heterogeneous-parallel-programming heterogeneous-systems multithreaded multithreading task-parallelism
Last synced: 01 Apr 2025
https://github.com/ucbrise/piranha
Piranha: A GPU Platform for Secure Computation
gpu-acceleration multi-party-computation privacy-preserving-machine-learning
Last synced: 22 Jun 2025
https://github.com/ashvardanian/parallelreductionsbenchmark
Thrust, CUB, TBB, AVX2, AVX-512, CUDA, OpenCL, OpenMP, Metal - all it takes to sum a lot of numbers fast!
apple avx512 cuda glsl gpgpu gpu gpu-acceleration gpu-computing hpc intel metal nvidia opencl openmp parallel simd stl tbb thrust
Last synced: 06 Apr 2025
https://github.com/guillaume-chevalier/glove-as-a-tensorflow-embedding-layer
Taking a pretrained GloVe model, and using it as a TensorFlow embedding weight layer **inside the GPU**. Therefore, you only need to send the index of the words through the GPU data transfer bus, reducing data transfer overhead.
cosine-similarity glove glove-embeddings gpu gpu-acceleration gpu-tensorflow neural-network tensorflow tensorflow-layers word-embeddings word2vec
Last synced: 30 Apr 2025
https://github.com/slai-labs/get-beam
Run GPU inference and training jobs on serverless infrastructure that scales with you.
artificial-intelligence cloud-computing cost-optimization data-science deep-learning distributed-computing gpu-acceleration gpu-computing hpc llm-serving llm-training machine-learning ml-infrastructure mlops python serverless serverless-architectures
Last synced: 18 Apr 2025
https://github.com/larsgeb/m1-gpu-cpp
Metal Shading Language on Apple M1's GPU for scientific C++.
clang cpp cpp17 gpu-acceleration gpu-computing m1-mac metal metal-cpp objective-c scientific-computing
Last synced: 17 Mar 2025
https://github.com/juliateachingctu/scientific-programming-in-julia
Repository for B0M36SPJ
automatic-differentiation differential-equations gpu-acceleration julia julia-language julialang meta-programming parallel-programming
Last synced: 06 Apr 2025
https://github.com/aestream/aestream
Efficient streaming of sparse event data supporting files, network I/O, GPU peripherals (via Torch/Jax/Numpy) and neuromorphic protocols
coroutines event-camera gpu-acceleration neuromorphic pytorch
Last synced: 19 Nov 2025
https://paragroup.github.io/WindFlow/
A C++17 Data Stream Processing Parallel Library for Multicores and GPUs
cuda gpu gpu-acceleration gpu-computing gpu-programming multi-core multicore multithreading parallel-computing parallel-patterns parallel-programming parallelism sliding-windows stream stream-api stream-processing streaming streaming-api streaming-data streams
Last synced: 14 May 2025
https://github.com/oalieno/asm2vec-pytorch
Unofficial implementation of asm2vec using PyTorch (with GPU acceleration)
asm2vec gpu-acceleration machine-learning neural-language-processing python pytorch unofficial
Last synced: 10 May 2025
https://github.com/kunitoki/yup
YUP is an open-source library dedicated to empowering developers with advanced tools for cross-platform application development.
application-framework audio gpu-acceleration graphics gui juce rive
Last synced: 08 May 2025
https://github.com/selkies-project/selkies-vdi
WebRTC & Xpra desktops on Selkies
gke gpu-acceleration kubernetes selkies vdi webrtc xpra
Last synced: 23 Jul 2025
https://github.com/brian-team/brian2cuda
A brian2 extension to simulate spiking neural networks on GPUs
biological-simulations brian brian2 code-generation computational-neuroscience differential-equations gpu gpu-acceleration neuroscience python science simulation simulation-framework spiking-neural-networks
Last synced: 05 Apr 2025
https://github.com/maksyuki/TaichiGAME
GPU Accelerated Motion Engine based on Taichi Lang.
gpu-acceleration motion motion-control motion-planning physics-engine physics-simulation python3 simulation taichi
Last synced: 02 Apr 2025
https://github.com/maksyuki/taichigame
GPU Accelerated Motion Engine based on Taichi Lang.
gpu-acceleration motion motion-control motion-planning physics-engine physics-simulation python3 simulation taichi
Last synced: 10 Apr 2025