Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by jundaf2

A curated list of projects in awesome lists by jundaf2.

https://github.com/jundaf2/CUDA-INT8-GEMM

CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API

Last synced: 19 Nov 2024

https://github.com/jundaf2/eigenmha

Forward and backward Attention DNN operators implemented with LibTorch, cuDNN, and Eigen.

backpropagation cuda cudnn cudnn-v8 dnn inference pytorch

Last synced: 15 Nov 2024

https://github.com/jundaf2/time-domain-cem

The only known (as of 2022) open-source, easy-to-understand implementations of basic algorithms in TD-CEM. (Please star and fork this project if you find it useful!)

pde pde-solver

Last synced: 15 Nov 2024

https://github.com/jundaf2/dnn-2d-fdtd

Last synced: 15 Nov 2024

https://github.com/jundaf2/finite-element-domain-decomposition

Overlapping Schwarz Domain Decomposition Finite Element Algorithm in both MATLAB and serial/parallel C++

Last synced: 15 Nov 2024

https://github.com/jundaf2/dnn-test-framework

DNN unit test framework

Last synced: 15 Nov 2024

https://github.com/jundaf2/rnn-1d-fdtd

Last synced: 15 Nov 2024

https://github.com/jundaf2/gpu-tensor-permute

permute sequence data on GPU with high bandwidth

cuda gpu-acceleration sequence-to-sequence

Last synced: 15 Nov 2024

https://github.com/jundaf2/regex-gpu

Block-optimized regular expression matching engine on GPU

Last synced: 15 Nov 2024

https://github.com/jundaf2/cutlass-kernel-volta-gemm

volta fp16 gemm kernel

Last synced: 16 Jan 2025

https://github.com/jundaf2/adaptive-filtering-algorithms

Adaptive Algorithms

Last synced: 16 Jan 2025

https://github.com/jundaf2/eigendnn

Last synced: 16 Jan 2025

https://github.com/jundaf2/method-of-moments

method of moments (MoM) for conducting wire problem

Last synced: 16 Jan 2025

https://github.com/jundaf2/cutlass-b2bgemm

an extension to the cutlass half-precision b2b gemm example

Last synced: 16 Jan 2025

https://github.com/jundaf2/fetd-mfem

A simple finite-element time-domain (FETD) example built with MFEM

Last synced: 16 Jan 2025

https://github.com/jundaf2/gpu-gym

a toy used for keeping all gpus on a machine busy using nccl

Last synced: 16 Jan 2025

https://github.com/jundaf2/gpu-philox

cuda philox in a single kernel (easily used in fusion)

Last synced: 16 Jan 2025

https://github.com/jundaf2/mp3-distributed-transaction

UIUC ECE 428 MP3

Last synced: 16 Jan 2025

https://github.com/jundaf2/home

Last synced: 16 Jan 2025

https://github.com/jundaf2/flash-ffn

fast fused feed forward network

Last synced: 16 Jan 2025

https://github.com/jundaf2/regex-cpu

A simple and naive CPU version of NFA-based Regex matching to play with.

Last synced: 16 Jan 2025

https://github.com/jundaf2/fp8_basic_op

a tiny library for FP8 basic operations

Last synced: 16 Jan 2025

https://github.com/jundaf2/mp2-raft

MP Submission

Last synced: 16 Jan 2025

https://github.com/jundaf2/heterogeneous-gpus

Heterogeneous Nvidia (CUDA) and Intel (OpenCL) GPU Programming

Last synced: 16 Jan 2025