Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by jundaf2
A curated list of projects in awesome lists by jundaf2.
https://github.com/jundaf2/CUDA-INT8-GEMM
CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API
Last synced: 19 Nov 2024
https://github.com/jundaf2/cuda-int8-gemm
CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API
Last synced: 15 Nov 2024
https://github.com/jundaf2/eigenmha
Forward and backward Attention DNN operators implemented with LibTorch, cuDNN, and Eigen.
backpropagation cuda cudnn cudnn-v8 dnn inference pytorch
Last synced: 15 Nov 2024
https://github.com/jundaf2/time-domain-cem
The only known (as of 2022) open-source, easy-to-understand implementations of basic TD-CEM algorithms. (Please star and fork this project if you find it useful!)
Last synced: 15 Nov 2024
https://github.com/jundaf2/finite-element-domain-decomposition
Overlapping Schwarz Domain Decomposition Finite Element Algorithm in both MATLAB and serial/parallel C++
Last synced: 15 Nov 2024
https://github.com/jundaf2/gpu-tensor-permute
permute sequence data on GPU with high bandwidth
cuda gpu-acceleration sequence-to-sequence
Last synced: 15 Nov 2024
https://github.com/jundaf2/regex-gpu
Block-optimized regular expression matching engine on GPU
Last synced: 15 Nov 2024
https://github.com/jundaf2/cutlass-kernel-volta-gemm
volta fp16 gemm kernel
Last synced: 16 Jan 2025
https://github.com/jundaf2/adaptive-filtering-algorithms
Adaptive Algorithms
Last synced: 16 Jan 2025
https://github.com/jundaf2/method-of-moments
method of moments (MoM) for conducting wire problem
Last synced: 16 Jan 2025
https://github.com/jundaf2/cutlass-b2bgemm
an extension to the cutlass half-precision b2b gemm example
Last synced: 16 Jan 2025
https://github.com/jundaf2/fetd-mfem
A simple finite-element time-domain example built with MFEM
Last synced: 16 Jan 2025
https://github.com/jundaf2/gpu-gym
a toy used for keeping all gpus on a machine busy using nccl
Last synced: 16 Jan 2025
https://github.com/jundaf2/gpu-philox
cuda philox in a single kernel (easily used in fusion)
Last synced: 16 Jan 2025
https://github.com/jundaf2/regex-cpu
A simple and naive CPU version of NFA-based Regex matching to play with.
Last synced: 16 Jan 2025
https://github.com/jundaf2/fp8_basic_op
a tiny library for FP8 basic operations
Last synced: 16 Jan 2025
https://github.com/jundaf2/heterogeneous-gpus
Heterogeneous NVIDIA (CUDA) and Intel (OpenCL) GPU Programming
Last synced: 16 Jan 2025