https://github.com/mlc-ai/mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://github.com/mlc-ai/mlc-llm

language-model llm machine-learning-compilation tvm

Last synced: about 1 month ago
JSON representation

Universal LLM Deployment Engine with ML Compilation

Host: GitHub
URL: https://github.com/mlc-ai/mlc-llm
Owner: mlc-ai
License: apache-2.0
Created: 2023-04-29T01:59:25.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-05-01T13:50:09.000Z (about 2 months ago)
Last Synced: 2025-05-06T11:33:47.725Z (about 1 month ago)
Topics: language-model, llm, machine-learning-compilation, tvm
Language: Python
Homepage: https://llm.mlc.ai/
Size: 33.5 MB
Stars: 20,554
Watchers: 183
Forks: 1,715
Open Issues: 268
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-local-llms - mlc-llm
awesome-llamas - mlc-llm - Running LLaMA2 on iOS devices natively using GPU acceleration, see [example](https://twitter.com/bohanhou1998/status/1681682445937295360) (Libraries)
Awesome-LLM-Productization - mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. (Models and Tools / LLM Deployment)
awesome - mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation (Python)
awesome-ai - MLC LLM - ai/mlc-llm?style=social) - 代表了一种新的思路，serverless，允许在手机、电脑等终端上直接运行 LLM (LLM)
awesome - mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation (Python)
awesome - mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation (Python)
ai-game-devtools - MLC LLM
StarryDivineSky - mlc-ai/mlc-llm
awesome-llm-and-aigc - MLC LLM - ai/mlc-llm?style=social"/> : Universal LLM Deployment Engine with ML Compilation. [llm.mlc.ai/](https://llm.mlc.ai/) (Summary)
awesome-llm-and-aigc - MLC LLM - ai/mlc-llm?style=social"/> : Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. [mlc.ai/mlc-llm](https://mlc.ai/mlc-llm/) (Summary)
awesome-cuda-and-hpc - MLC LLM - ai/mlc-llm?style=social"/> : Universal LLM Deployment Engine with ML Compilation. [llm.mlc.ai/](https://llm.mlc.ai/) (Frameworks)
awesome-cuda-and-hpc - MLC LLM - ai/mlc-llm?style=social"/> : Enable everyone to develop, optimize and deploy AI models natively on everyone's devices. [mlc.ai/mlc-llm](https://mlc.ai/mlc-llm/) (Frameworks)
AiTreasureBox - mlc-ai/mlc-llm - 06-19_20823_5](https://img.shields.io/github/stars/mlc-ai/mlc-llm.svg) |Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.| (Repos)
Awesome-LLMs-on-device - [Github
Awesome-LLMOps - MLC LLM - ai/mlc-llm.svg?style=flat&color=green) ![Contributors](https://img.shields.io/github/contributors/mlc-ai/mlc-llm?color=green) ![LastCommit](https://img.shields.io/github/last-commit/mlc-ai/mlc-llm?color=green) (Inference / Inference Engine)
Awesome-LLMOps - MLC LLM - ai/mlc-llm.svg) | ![Release](https://img.shields.io/github/release/mlc-ai/mlc-llm) | ![Contributors](https://img.shields.io/github/contributors/mlc-ai/mlc-llm) | Universal LLM Deployment Engine with ML Compilation | | (Inference)
stars - mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation (Python)
awesome-production-machine-learning - MLC LLM - ai/mlc-llm.svg?style=social) - MLC LLM is a universal solution that allows any language models to be deployed natively on a diverse set of hardware backends and native applications, plus a productive framework for everyone to further optimize model performance for their own use cases. (Industry Strength Natural Language Processing)

README

        


# MLC LLM

[![Installation](https://img.shields.io/badge/docs-latest-green)](https://llm.mlc.ai/docs/)

[![License](https://img.shields.io/badge/license-apache_2-blue)](https://github.com/mlc-ai/mlc-llm/blob/main/LICENSE)

[![Join Discoard](https://img.shields.io/badge/Join-Discord-7289DA?logo=discord&logoColor=white)](https://discord.gg/9Xpy2HGBuD)

[![Related Repository: WebLLM](https://img.shields.io/badge/Related_Repo-WebLLM-fafbfc?logo=github)](https://github.com/mlc-ai/web-llm/)

**Universal LLM Deployment Engine with ML Compilation**

[Get Started](https://llm.mlc.ai/docs/get_started/quick_start) | [Documentation](https://llm.mlc.ai/docs) | [Blog](https://blog.mlc.ai/)



## About

MLC LLM is a machine learning compiler and high-performance deployment engine for large language models.  The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on everyone's platforms. 



  

    

       

      AMD GPU

      NVIDIA GPU

      Apple GPU

      Intel GPU

    

  

  

    

      Linux / Win

      ✅ Vulkan, ROCm

      ✅ Vulkan, CUDA

      N/A

      ✅ Vulkan

    

    

      macOS

      ✅ Metal (dGPU)

      N/A

      ✅ Metal

      ✅ Metal (iGPU)

    

    

      Web Browser

      ✅ WebGPU and WASM 

    

    

      iOS / iPadOS

      ✅ Metal on Apple A-series GPU

    

    

      Android

      ✅ OpenCL on Adreno GPU

      ✅ OpenCL on Mali GPU

    

  



MLC LLM compiles and runs code on MLCEngine -- a unified high-performance LLM inference engine across the above platforms. MLCEngine provides OpenAI-compatible API available through REST server, python, javascript, iOS, Android, all backed by the same engine and compiler that we keep improving with the community.

## Get Started

Please visit our [documentation](https://llm.mlc.ai/docs/) to get started with MLC LLM.

- [Installation](https://llm.mlc.ai/docs/install/mlc_llm)

- [Quick start](https://llm.mlc.ai/docs/get_started/quick_start)

- [Introduction](https://llm.mlc.ai/docs/get_started/introduction)

## Citation

Please consider citing our project if you find it useful:

```bibtex

@software{mlc-llm,

    author = {{MLC team}},

    title = {{MLC-LLM}},

    url = {https://github.com/mlc-ai/mlc-llm},

    year = {2023-2025}

}

```

The underlying techniques of MLC LLM include:

  References (Click to expand)

  ```bibtex

  @inproceedings{tensorir,

      author = {Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi},

      title = {TensorIR: An Abstraction for Automatic Tensorized Program Optimization},

      year = {2023},

      isbn = {9781450399166},

      publisher = {Association for Computing Machinery},

      address = {New York, NY, USA},

      url = {https://doi.org/10.1145/3575693.3576933},

      doi = {10.1145/3575693.3576933},

      booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},

      pages = {804–817},

      numpages = {14},

      keywords = {Tensor Computation, Machine Learning Compiler, Deep Neural Network},

      location = {Vancouver, BC, Canada},

      series = {ASPLOS 2023}

  }

  @inproceedings{metaschedule,

      author = {Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, Masahiro and Yu, Cody Hao and Chen, Tianqi},

      booktitle = {Advances in Neural Information Processing Systems},

      editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},

      pages = {35783--35796},

      publisher = {Curran Associates, Inc.},

      title = {Tensor Program Optimization with Probabilistic Programs},

      url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf},

      volume = {35},

      year = {2022}

  }

  @inproceedings{tvm,

      author = {Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy},

      title = {{TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning},

      booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},

      year = {2018},

      isbn = {978-1-939133-08-3},

      address = {Carlsbad, CA},

      pages = {578--594},

      url = {https://www.usenix.org/conference/osdi18/presentation/chen},

      publisher = {USENIX Association},

      month = oct,

  }

  ```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mlc-ai/mlc-llm

Awesome Lists containing this project

README