{"id":26103499,"url":"https://github.com/inftyai/llmlite","last_synced_at":"2025-04-12T17:23:46.284Z","repository":{"id":193046625,"uuid":"687343922","full_name":"InftyAI/llmlite","owner":"InftyAI","description":"🌵 A library helps to communicate with all kinds of LLMs consistently.","archived":false,"fork":false,"pushed_at":"2024-07-05T21:08:08.000Z","size":252,"stargazers_count":18,"open_issues_count":18,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-12T17:23:39.540Z","etag":null,"topics":["llmops"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/InftyAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-09-05T07:01:30.000Z","updated_at":"2025-03-22T12:04:13.000Z","dependencies_parsed_at":"2023-10-16T02:46:07.994Z","dependency_job_id":"4153985e-2b17-487b-ba33-21e9eb604708","html_url":"https://github.com/InftyAI/llmlite","commit_stats":null,"previous_names":["inftyai/chatllm","inftyai/llmlite"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InftyAI%2Fllmlite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InftyAI%2Fllmlite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InftyAI%2Fllmlite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InftyAI%2Fllmlite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/InftyAI","download_url":"https://codeloa
d.github.com/InftyAI/llmlite/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248602874,"owners_count":21131705,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llmops"],"created_at":"2025-03-09T20:07:13.954Z","updated_at":"2025-04-12T17:23:46.256Z","avatar_url":"https://github.com/InftyAI.png","language":"Python","readme":"# llmlite\n\n[![Latest Release](https://img.shields.io/github/v/release/inftyai/llmlite?include_prereleases)](https://github.com/inftyai/llmlite/releases/latest)\n\n**🌵** llmlite is a library that helps you communicate with all kinds of LLMs consistently.\n\n## Features\n\n- Support for state-of-the-art LLMs\n- Continuous Batching via [vLLM](https://github.com/vllm-project/vllm)\n- Quantization ([issue#37](https://github.com/InftyAI/llmlite/issues/37))\n- Loading specific adapters ([issue#51](https://github.com/InftyAI/llmlite/issues/51))\n- Streaming ([issue#52](https://github.com/InftyAI/llmlite/issues/52))\n\n### Model Support\n\n| Model | State | System Prompt | Note |\n| ---- | ---- | ---- | ---- |\n| ChatGPT | Done ✅ | Yes | |\n| Llama-2 | Done ✅ | Yes | |\n| CodeLlama | Done ✅ | Yes | |\n| ChatGLM2 | Done ✅ | No | |\n| Baichuan2 | Done ✅ | Yes | |\n| ChatGLM3 | WIP ⏳ | Yes | |\n| Claude-2 | RoadMap 📋 | | [issue#7](https://github.com/InftyAI/ChatLLM/issues/7) |\n| Falcon | RoadMap 📋 | | [issue#8](https://github.com/InftyAI/ChatLLM/issues/8) |\n| StableLM | RoadMap 📋 | | [issue#11](https://github.com/InftyAI/ChatLLM/issues/11) |\n\n### Backend Support\n\n| Backend | State |\n| ---- | ---- 
|\n| [huggingface](https://github.com/huggingface) | Done ✅ |\n| [vLLM](https://github.com/vllm-project/vllm) | Done ✅ |\n\n## How to install\n\n```cmd\npip install llmlite==0.0.15\n```\n\n## How to use\n\n### Chat\n\n```python\nfrom llmlite import ChatLLM, ChatMessage\n\nchat = ChatLLM(\n    model_name_or_path=\"meta-llama/Llama-2-7b-chat-hf\",  # required\n    task=\"text-generation\",\n)\n\nresult = chat.completion(\n    messages=[\n        ChatMessage(role=\"system\", content=\"You're a honest assistant.\"),\n        ChatMessage(role=\"user\", content=\"There's a llama in my garden, what should I do?\"),\n    ]\n)\n\n# Output: Oh my goodness, a llama in your garden?! 😱 That's quite a surprise! 😅 As an honest assistant, I must inform you that llamas are not typically known for their gardening skills, so it's possible that the llama in your garden may have wandered there accidentally or is seeking shelter. 🐮 ...\n```\n\n### Continuous Batching\n\n_This is mainly supported through vLLM; you can enable it by configuring the **backend**._\n\n```python\nfrom llmlite import ChatLLM, ChatMessage\n\nchat = ChatLLM(\n    model_name_or_path=\"meta-llama/Llama-2-7b-chat-hf\",\n    backend=\"vllm\",\n)\n\nresults = chat.completion(\n    messages=[\n        [\n            ChatMessage(role=\"system\", content=\"You're a honest assistant.\"),\n            ChatMessage(role=\"user\", content=\"There's a llama in my garden, what should I do?\"),\n        ],\n        [\n            ChatMessage(role=\"user\", content=\"What's the population of the world?\"),\n        ],\n    ],\n    max_tokens=2048,\n)\n\nfor result in results:\n    print(f\"RESULT: \\n{result}\\n\\n\")\n```\n\n`llmlite` also supports other parameters, such as `temperature`, `max_length`, `do_sample`, `top_k`, and `top_p`, which help control the length, randomness, and diversity of the generated text.\n\nSee **[examples](./examples/)** for reference.\n\n### Prompting\n\nYou can use `llmlite` to help you generate full prompts, for 
instance:\n\n```python\nfrom llmlite import ChatLLM, ChatMessage\n\nmessages = [\n    ChatMessage(role=\"system\", content=\"You're a honest assistant.\"),\n    ChatMessage(role=\"user\", content=\"There's a llama in my garden, what should I do?\"),\n]\n\nChatLLM.prompt(\"meta-llama/Llama-2-7b-chat-hf\", messages)\n\n# Output:\n# \u003cs\u003e[INST] \u003c\u003cSYS\u003e\u003e\n# You're a honest assistant.\n# \u003c\u003c/SYS\u003e\u003e\n#\n# There's a llama in my garden, what should I do? [/INST]\n```\n\n### Logging\n\nSet the `LOG_LEVEL` environment variable to configure logging. It defaults to `INFO`; other levels such as `DEBUG` and `WARNING` are also accepted.\n\n## Contributions\n\n🚀 All kinds of contributions are welcome! Please follow the [Contributing](/CONTRIBUTING.md) guide.\n\n## Contributors\n\n🎉 Thanks to all these contributors.\n\n\u003ca href=\"https://github.com/InftyAI/ChatLLM/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=InftyAI/ChatLLM\" /\u003e\n\u003c/a\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finftyai%2Fllmlite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finftyai%2Fllmlite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finftyai%2Fllmlite/lists"}