An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by ModelTC

A curated list of projects in awesome lists by ModelTC.

https://github.com/modeltc/lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

deep-learning gpt llama llm model-serving nlp openai-triton

Last synced: 13 May 2025

https://github.com/ModelTC/lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

deep-learning gpt llama llm model-serving nlp openai-triton

Last synced: 20 Mar 2025

https://github.com/ModelTC/MQBench

Model Quantization Benchmark

Last synced: 20 Nov 2025

https://github.com/modeltc/mqbench

Model Quantization Benchmark

Last synced: 15 May 2025

https://github.com/modeltc/qwen-image-lightning

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Last synced: 06 Sep 2025

https://github.com/ModelTC/llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

awq benchmark deployment evaluation internlm2 large-language-models lightllm llama3 llm lvlm mixtral omniquant post-training-quantization pruning quantization quarot smoothquant spinquant tool vllm

Last synced: 23 Apr 2025

https://github.com/ModelTC/United-Perception

United Perception

Last synced: 20 Mar 2025

https://github.com/modeltc/united-perception

United Perception

Last synced: 08 Oct 2025

https://github.com/modeltc/dipoorlet

Offline quantization tools for deployment.

Last synced: 20 Jun 2025

https://github.com/modeltc/tfmq-dm

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

cvpr cvpr2024 ddim diffusion-models highlight ldm post-training-quantization quantization stable-diffusion

Last synced: 04 Apr 2025

https://github.com/modeltc/outlier_suppression_plus

Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling

Last synced: 04 Apr 2025

https://github.com/ModelTC/EasyLLM

Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.

Last synced: 12 May 2025

https://github.com/modeltc/easyllm

Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.

Last synced: 04 Apr 2025

https://github.com/modeltc/rank_dataset

PyTorch Dataset Rank Dataset

Last synced: 04 Apr 2025

https://github.com/modeltc/harmonica

[ICML 2025] This is the official PyTorch implementation of "HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".

acceleration diffusion-models diffusion-transformer dit feature-caching icml icml-2025 pixart pixart-sigma

Last synced: 13 Aug 2025

https://github.com/modeltc/nart

NART = NART is not A RunTime, a deep learning inference framework.

Last synced: 04 Apr 2025

https://github.com/modeltc/nnlqp

Last synced: 14 Jul 2025

https://github.com/modeltc/qllm

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

llama llama2 llm post-training-quantization pytorch quantization transformers

Last synced: 03 Aug 2025

https://github.com/modeltc/llmc

llmc is an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.

benchmark deployment evaluation large-language-models llm pruning quantization tool

Last synced: 04 Apr 2025

https://github.com/modeltc/omnibal

Last synced: 04 Apr 2025

https://github.com/modeltc/pyvlova

Yet another Polyhedra Compiler for DeepLearning

Last synced: 21 Jul 2025

https://github.com/modeltc/comfyui-lightx2vwrapper

ComfyUI custom node for lightx2v

comfyui comfyui-nodes

Last synced: 09 Oct 2025

https://github.com/modeltc/lightx2v

Light Video Generation Inference Framework

diffusion-models hunyuan-video video-generation wan-video

Last synced: 23 Oct 2025

https://github.com/modeltc/prototype

Last synced: 25 Jun 2025

https://github.com/modeltc/aaai2023_eampd

AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline

Last synced: 21 Jan 2026

https://github.com/modeltc/msbench

A tool for model sparsification based on torch.fx

Last synced: 11 Apr 2025

https://github.com/modeltc/general-sam

A general suffix automaton implementation in Rust with Python bindings

Last synced: 10 Sep 2025

https://github.com/modeltc/imagenet-s

Robustness for real-world system noise

Last synced: 01 Dec 2025

https://github.com/modeltc/general-sam-py

Python bindings for general-sam and some utilities

Last synced: 04 Apr 2025

https://github.com/modeltc/mtc-token-healing

Token healing implementation in Rust

Last synced: 04 Apr 2025

https://github.com/modeltc/fcpts

Last synced: 09 Apr 2025

https://github.com/modeltc/pyrotom

Python Code Hotfix and Refactor on the fly

Last synced: 29 Oct 2025

https://github.com/modeltc/statecs

Last synced: 10 Jan 2026

https://github.com/modeltc/llm_qat

Last synced: 07 May 2025

https://github.com/modeltc/unrt

UNiversal RunTime

Last synced: 03 Feb 2026

https://github.com/modeltc/greedy-tokenizer

Greedily tokenize strings with the longest tokens iteratively.

Last synced: 25 Oct 2025

https://github.com/modeltc/tvm-vit

Last synced: 19 Jan 2026