{"id":13442038,"url":"https://github.com/ModelTC/lightllm","last_synced_at":"2025-03-20T13:31:58.566Z","repository":{"id":182981032,"uuid":"669420857","full_name":"ModelTC/lightllm","owner":"ModelTC","description":"LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.","archived":false,"fork":false,"pushed_at":"2024-10-29T04:42:19.000Z","size":2510,"stargazers_count":2553,"open_issues_count":67,"forks_count":200,"subscribers_count":23,"default_branch":"main","last_synced_at":"2024-10-29T10:03:22.518Z","etag":null,"topics":["deep-learning","gpt","llama","llm","model-serving","nlp","openai-triton"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ModelTC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-22T08:11:15.000Z","updated_at":"2024-10-29T09:14:42.000Z","dependencies_parsed_at":"2024-01-05T08:24:44.068Z","dependency_job_id":"1ad4575b-2eb1-439e-a784-0bf0c6b5f006","html_url":"https://github.com/ModelTC/lightllm","commit_stats":{"total_commits":310,"total_committers":35,"mean_commits":8.857142857142858,"dds":0.5935483870967742,"last_synced_commit":"71d12085fc1b29767ff62d878b98d4e3692de3aa"},"previous_names":["modeltc/lightllm"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelTC%2Flightllm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/ModelTC%2Flightllm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelTC%2Flightllm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelTC%2Flightllm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ModelTC","download_url":"https://codeload.github.com/ModelTC/lightllm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244560343,"owners_count":20472219,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","gpt","llama","llm","model-serving","nlp","openai-triton"],"created_at":"2024-07-31T03:01:40.978Z","updated_at":"2025-03-20T13:31:58.560Z","avatar_url":"https://github.com/ModelTC.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003cimg alt=\"LightLLM\" src=\"assets/lightllm.drawio.png\" width=90%\u003e\n  \u003c/picture\u003e\n\u003c/div\u003e\n\n---\n\u003cdiv align=\"center\"\u003e\n\n[![docs](https://img.shields.io/badge/docs-latest-blue)](https://lightllm-en.readthedocs.io/en/latest/)\n[![Docker](https://github.com/ModelTC/lightllm/actions/workflows/docker-publish.yml/badge.svg)](https://github.com/ModelTC/lightllm/actions/workflows/docker-publish.yml)\n[![stars](https://img.shields.io/github/stars/ModelTC/lightllm?style=social)](https://github.com/ModelTC/lightllm)\n![visitors](https://komarev.com/ghpvc/?username=lightllm\u0026label=visitors)\n[![Discord 
Banner](https://img.shields.io/discord/1139835312592392214?logo=discord\u0026logoColor=white)](https://discord.gg/WzzfwVSguU)\n[![license](https://img.shields.io/github/license/ModelTC/lightllm)](https://github.com/ModelTC/lightllm/blob/main/LICENSE)\n\u003c/div\u003e\n\nLightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance. LightLLM harnesses the strengths of numerous well-regarded open-source implementations, including but not limited to FasterTransformer, TGI, vLLM, and FlashAttention.\n\n\n[English Docs](https://lightllm-en.readthedocs.io/en/latest/) | [中文文档](https://lightllm-cn.readthedocs.io/en/latest/) | [Blogs](https://modeltc.github.io/lightllm-blog/)\n\n## News\n- [2025/02] 🔥 LightLLM v1.0.0 release, achieving the **fastest DeepSeek-R1** serving performance on a single H200 machine.\n\n## Get started\n\n- [Install LightLLM](https://lightllm-en.readthedocs.io/en/latest/getting_started/installation.html)\n- [Quick Start](https://lightllm-en.readthedocs.io/en/latest/getting_started/quickstart.html)\n- [LLM Service](https://lightllm-en.readthedocs.io/en/latest/models/test.html#llama)\n- [VLM Service](https://lightllm-en.readthedocs.io/en/latest/models/test.html#llava)\n\n\n## Performance\n\nLearn more in the release blogs: [v1.0.0 blog](https://www.light-ai.top/lightllm-blog//by%20mtc%20team/2025/02/16/lightllm/).\n\n## FAQ\n\nPlease refer to the [FAQ](https://lightllm-en.readthedocs.io/en/latest/faq.html) for more information.\n\n## Projects using lightllm\n\nWe welcome any cooperation and contribution. If there is a project that requires lightllm's support, please contact us via email or create a pull request.\n\n\n1. 
\u003cdetails\u003e\u003csummary\u003e \u003cb\u003e\u003ca href=https://github.com/LazyAGI/LazyLLM\u003eLazyLLM\u003c/a\u003e\u003c/b\u003e: The easiest and laziest way to build multi-agent LLM applications.\u003c/summary\u003e\n\n    Once you have installed `lightllm` and `lazyllm`, you can use the following code to build your own chatbot:\n\n    ~~~python\n    from lazyllm import TrainableModule, deploy, WebModule\n    # The model will be downloaded automatically if you have an internet connection\n    m = TrainableModule('internlm2-chat-7b').deploy_method(deploy.lightllm)\n    WebModule(m).start().wait()\n    ~~~\n\n    Documentation: https://lazyllm.readthedocs.io/\n\n    \u003c/details\u003e\n\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=ModelTC/lightllm\u0026type=Timeline)](https://star-history.com/#ModelTC/lightllm\u0026Timeline)\n\n## Community\n\nFor further information and discussion, [join our discord server](https://discord.gg/WzzfwVSguU). 
You are welcome to join as a member, and we look forward to your contributions!\n\n## License\n\nThis repository is released under the [Apache-2.0](LICENSE) license.\n\n## Acknowledgement\n\nWe learned a lot from the following projects when developing LightLLM.\n- [Faster Transformer](https://github.com/NVIDIA/FasterTransformer)\n- [Text Generation Inference](https://github.com/huggingface/text-generation-inference)\n- [vLLM](https://github.com/vllm-project/vllm)\n- [Flash Attention 1\u00262](https://github.com/Dao-AILab/flash-attention)\n- [OpenAI Triton](https://github.com/openai/triton)\n","funding_links":[],"categories":["Python","**Model Compression for Large Language Models**","A01_文本生成_文本对话","📖Contents","Model Serving","Deployment and Serving","🔓 Open Source Inference Engines","Inference Engines \u0026 Backends (22)","LLM Serving / Inference","LLM 部署与推理 (Deployment \u0026 Inference)"],"sub_categories":["**Memory Optimization**","大语言对话模型及数据","📖LLM Train/Inference Framework/Design ([©️back👆🏻](#paperlist))","推理引擎 (Inference Engines)"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FModelTC%2Flightllm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FModelTC%2Flightllm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FModelTC%2Flightllm/lists"}