An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by feifeibear

A curated list of projects in awesome lists by feifeibear.

https://github.com/feifeibear/llmspeculativesampling

Fast inference from large language models via speculative decoding

Last synced: 13 Apr 2025

https://github.com/feifeibear/long-context-attention

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

attention-is-all-you-need deepspeed-ulysses llm-inference llm-training pytorch ring-attention

Last synced: 14 May 2025

https://github.com/feifeibear/odysseus-transformer

Odysseus: Playground of LLM Sequence Parallelism

llm megatron-lm pytorch

Last synced: 22 Nov 2024

https://github.com/feifeibear/swcaffe

A Deep Learning Framework customized for Sunway TaihuLight

caffe deep-learning mpi sunway-taihulight

Last synced: 22 Nov 2024

https://github.com/feifeibear/chituattention

Quantized Attention on GPU

Last synced: 22 Nov 2024

https://github.com/feifeibear/distributed-resnet-tensorflow

A distributed ResNet across multiple machines, each with one GPU card.

distributed imagenet-dataset resnet tensorflow

Last synced: 22 Nov 2024

https://github.com/feifeibear/swdnn

A highly efficient library for deep neural networks on the Sunway TaihuLight supercomputer.

Last synced: 16 Mar 2025

https://github.com/feifeibear/swgemm

A highly efficient library for GEMM operations on Sunway TaihuLight

Last synced: 22 Nov 2024

https://github.com/feifeibear/pstensor

PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.

cuda deeplearning machinelearning pytorch tensorflow2

Last synced: 16 Mar 2025

https://github.com/feifeibear/pytorchmemtracer

Depict the GPU memory footprint during DNN training in PyTorch

dnn memory oom pytorch

Last synced: 22 Nov 2024

https://github.com/feifeibear/llmroofline

Compare different hardware platforms via the Roofline Model for LLM inference tasks.

Last synced: 22 Nov 2024

https://github.com/feifeibear/crack_leetcode

Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!

Last synced: 15 Apr 2025

https://github.com/feifeibear/deepspeedzero3benchmark

Fine-tuned benchmark scripts for DeepSpeed ZeRO stage 3

Last synced: 15 Apr 2025

https://github.com/feifeibear/swdnnv1.0

A Deep Learning Library for Sunway TaihuLight

Last synced: 22 Nov 2024

https://github.com/feifeibear/smo-svm

A Python implementation of libsvm

Last synced: 16 Mar 2025

https://github.com/feifeibear/ssh-passwd-free

A method to set up password-free SSH access for a set of IPs

Last synced: 15 Apr 2025

https://github.com/feifeibear/tensorrtbenchmark

Benchmark BERT using TensorRT

Last synced: 22 Nov 2024

https://github.com/feifeibear/89757

Last synced: 16 Mar 2025

https://github.com/feifeibear/large-scale-tensorflow-benchmark

Benchmark TensorFlow for supercomputers

Last synced: 16 Mar 2025

https://github.com/feifeibear/dtensor

Study PyTorch DTensor

Last synced: 16 Mar 2025

https://github.com/feifeibear/commtest

Test for PyTorch Async Collective Communication

Last synced: 16 Mar 2025

https://github.com/feifeibear/dpskv3mfu

Estimate MFU for DeepSeekV3

Last synced: 10 Mar 2025

https://github.com/feifeibear/admm-neuralnetwork

ADMM-NeuralNetwork was implemented by a potato

Last synced: 16 Mar 2025

https://github.com/feifeibear/distributed-compression-dnn

A repo cloned from TernGrad

Last synced: 16 Mar 2025

https://github.com/feifeibear/dist-tensorflow

TensorFlow test for supercomputers

Last synced: 16 Mar 2025

https://github.com/feifeibear/.dotfile

Last synced: 10 Mar 2025

https://github.com/feifeibear/spark-smo-svm-libsvm

A Spark-based SVM training program using the SMO method

Last synced: 16 Mar 2025

https://github.com/feifeibear/testgloo

Test MPI for Gloo

Last synced: 16 Mar 2025

https://github.com/feifeibear/test-ci

Last synced: 16 Mar 2025

https://github.com/feifeibear/89758

Last synced: 16 Mar 2025

https://github.com/feifeibear/spark-smo-svm-ws1

A Spark-based SVM training program using the SMO method. The working set selection method is first-order.

Last synced: 16 Mar 2025

https://github.com/feifeibear/megablocks

A self-maintained version of megablocks (https://github.com/stanford-futuredata/megablocks)

Last synced: 16 Mar 2025