{"id":13407349,"url":"https://github.com/mlc-ai/mlc-llm","last_synced_at":"2025-05-13T15:06:43.590Z","repository":{"id":158529280,"uuid":"634081686","full_name":"mlc-ai/mlc-llm","owner":"mlc-ai","description":"Universal LLM Deployment Engine with ML Compilation","archived":false,"fork":false,"pushed_at":"2025-05-01T13:50:09.000Z","size":35177,"stargazers_count":20554,"open_issues_count":268,"forks_count":1715,"subscribers_count":183,"default_branch":"main","last_synced_at":"2025-05-06T11:33:47.725Z","etag":null,"topics":["language-model","llm","machine-learning-compilation","tvm"],"latest_commit_sha":null,"homepage":"https://llm.mlc.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlc-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-04-29T01:59:25.000Z","updated_at":"2025-05-06T08:26:33.000Z","dependencies_parsed_at":"2024-05-19T17:46:49.551Z","dependency_job_id":"21f0511d-9e4d-42c2-b691-d3f84b77219b","html_url":"https://github.com/mlc-ai/mlc-llm","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlc-ai%2Fmlc-llm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlc-ai%2Fmlc-llm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlc-ai%2Fmlc-llm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlc-ai%2Fmlc-llm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/mlc-ai","download_url":"https://codeload.github.com/mlc-ai/mlc-llm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253501851,"owners_count":21918326,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["language-model","llm","machine-learning-compilation","tvm"],"created_at":"2024-07-30T20:00:38.280Z","updated_at":"2025-05-13T15:06:43.565Z","avatar_url":"https://github.com/mlc-ai.png","language":"Python","funding_links":[],"categories":["Libraries","Models and Tools","Python","others","llm","Project List","A01_文本生成_文本对话","Deployment Stacks","Inference \u0026 Deployment","📱 Mobile \u0026 Edge Deployment","HarmonyOS","LLM inference engines","Summary","Frameworks","Awesome Open-Sourced LLMSys Projects","Repos","Hardware Acceleration and Deployment Strategies","Edge and Retro Inference","Industry Strength Natural Language Processing","🔓 Open Source Inference Engines","Modern Era: Large Models, Agents \u0026 Cognitive Systems on Edge (2023–2026)","Software","Model Serving Frameworks","Networks","Inference","LLM","Open-Source Local LLM Projects","Inference Engines \u0026 Backends (22)","8. Inference Engines","Infrastructure / Deployment of LLMs on Device","3. 
Inference Engines \u0026 Serving","Local Inference and Serving","🚀 Inference Engines","Quantization, Distillation, and Compression","Model Inference"],"sub_categories":["LLM Deployment","\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e","大语言对话模型及数据","Production Inference Servers","Cloud \u0026 Container Deployment","Mobile LLM Solutions","Windows Manager","Distributed Systems","Popular On-Device LLMs Framework","2.2. Edge AI","Memory/Cache Modeling/Analysis","LangManus","Inference Engine","Desktop / Local","Deployment Frameworks","Run locally","LLM \u0026 GenAI Specialized","Quantization libraries"],"readme":"\u003cdiv align=\"center\"\u003e\n\n# MLC LLM\n\n[![Installation](https://img.shields.io/badge/docs-latest-green)](https://llm.mlc.ai/docs/)\n[![License](https://img.shields.io/badge/license-apache_2-blue)](https://github.com/mlc-ai/mlc-llm/blob/main/LICENSE)\n[![Join Discord](https://img.shields.io/badge/Join-Discord-7289DA?logo=discord\u0026logoColor=white)](https://discord.gg/9Xpy2HGBuD)\n[![Related Repository: WebLLM](https://img.shields.io/badge/Related_Repo-WebLLM-fafbfc?logo=github)](https://github.com/mlc-ai/web-llm/)\n\n**Universal LLM Deployment Engine with ML Compilation**\n\n[Get Started](https://llm.mlc.ai/docs/get_started/quick_start) | [Documentation](https://llm.mlc.ai/docs) | [Blog](https://blog.mlc.ai/)\n\n\u003c/div\u003e\n\n## About\n\nMLC LLM is a machine learning compiler and high-performance deployment engine for large language models. The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on everyone's platforms. 
\n\n\u003cdiv align=\"center\"\u003e\n\u003ctable style=\"width:100%\"\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth style=\"width:15%\"\u003e \u003c/th\u003e\n      \u003cth style=\"width:20%\"\u003eAMD GPU\u003c/th\u003e\n      \u003cth style=\"width:20%\"\u003eNVIDIA GPU\u003c/th\u003e\n      \u003cth style=\"width:20%\"\u003eApple GPU\u003c/th\u003e\n      \u003cth style=\"width:24%\"\u003eIntel GPU\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eLinux / Win\u003c/td\u003e\n      \u003ctd\u003e✅ Vulkan, ROCm\u003c/td\u003e\n      \u003ctd\u003e✅ Vulkan, CUDA\u003c/td\u003e\n      \u003ctd\u003eN/A\u003c/td\u003e\n      \u003ctd\u003e✅ Vulkan\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003emacOS\u003c/td\u003e\n      \u003ctd\u003e✅ Metal (dGPU)\u003c/td\u003e\n      \u003ctd\u003eN/A\u003c/td\u003e\n      \u003ctd\u003e✅ Metal\u003c/td\u003e\n      \u003ctd\u003e✅ Metal (iGPU)\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eWeb Browser\u003c/td\u003e\n      \u003ctd colspan=4\u003e✅ WebGPU and WASM \u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eiOS / iPadOS\u003c/td\u003e\n      \u003ctd colspan=4\u003e✅ Metal on Apple A-series GPU\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003eAndroid\u003c/td\u003e\n      \u003ctd colspan=2\u003e✅ OpenCL on Adreno GPU\u003c/td\u003e\n      \u003ctd colspan=2\u003e✅ OpenCL on Mali GPU\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\nMLC LLM compiles and runs code on MLCEngine -- a unified high-performance LLM inference engine across the above platforms. 
MLCEngine provides an OpenAI-compatible API available through a REST server, Python, JavaScript, iOS, and Android, all backed by the same engine and compiler that we keep improving with the community.\n\n## Get Started\n\nPlease visit our [documentation](https://llm.mlc.ai/docs/) to get started with MLC LLM.\n- [Installation](https://llm.mlc.ai/docs/install/mlc_llm)\n- [Quick start](https://llm.mlc.ai/docs/get_started/quick_start)\n- [Introduction](https://llm.mlc.ai/docs/get_started/introduction)\n\n## Citation\n\nPlease consider citing our project if you find it useful:\n\n```bibtex\n@software{mlc-llm,\n    author = {{MLC team}},\n    title = {{MLC-LLM}},\n    url = {https://github.com/mlc-ai/mlc-llm},\n    year = {2023-2025}\n}\n```\n\nThe underlying techniques of MLC LLM include:\n\n\u003cdetails\u003e\n  \u003csummary\u003eReferences (Click to expand)\u003c/summary\u003e\n\n  ```bibtex\n  @inproceedings{tensorir,\n      author = {Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi},\n      title = {TensorIR: An Abstraction for Automatic Tensorized Program Optimization},\n      year = {2023},\n      isbn = {9781450399166},\n      publisher = {Association for Computing Machinery},\n      address = {New York, NY, USA},\n      url = {https://doi.org/10.1145/3575693.3576933},\n      doi = {10.1145/3575693.3576933},\n      booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},\n      pages = {804–817},\n      numpages = {14},\n      keywords = {Tensor Computation, Machine Learning Compiler, Deep Neural Network},\n      location = {Vancouver, BC, Canada},\n      series = {ASPLOS 2023}\n  }\n\n  @inproceedings{metaschedule,\n      author = {Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, 
Masahiro and Yu, Cody Hao and Chen, Tianqi},\n      booktitle = {Advances in Neural Information Processing Systems},\n      editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},\n      pages = {35783--35796},\n      publisher = {Curran Associates, Inc.},\n      title = {Tensor Program Optimization with Probabilistic Programs},\n      url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf},\n      volume = {35},\n      year = {2022}\n  }\n\n  @inproceedings{tvm,\n      author = {Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy},\n      title = {{TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning},\n      booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},\n      year = {2018},\n      isbn = {978-1-939133-08-3},\n      address = {Carlsbad, CA},\n      pages = {578--594},\n      url = {https://www.usenix.org/conference/osdi18/presentation/chen},\n      publisher = {USENIX Association},\n      month = oct,\n  }\n  ```\n\u003c/details\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlc-ai%2Fmlc-llm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlc-ai%2Fmlc-llm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlc-ai%2Fmlc-llm/lists"}