{"id":35636971,"url":"https://github.com/lmcache/lmcache","last_synced_at":"2026-06-13T03:11:17.475Z","repository":{"id":260139084,"uuid":"807305060","full_name":"LMCache/LMCache","owner":"LMCache","description":"Supercharge Your LLM with the Fastest KV Cache Layer","archived":false,"fork":false,"pushed_at":"2026-04-22T23:23:07.000Z","size":31649,"stargazers_count":8087,"open_issues_count":292,"forks_count":1118,"subscribers_count":42,"default_branch":"dev","last_synced_at":"2026-04-23T00:29:43.397Z","etag":null,"topics":["amd","cuda","fast","inference","kv-cache","llm","pytorch","rocm","speed","vllm"],"latest_commit_sha":null,"homepage":"https://lmcache.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LMCache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":"MAINTAINERS.md","copyright":null,"agents":"AGENTS.md","dco":"DCO","cla":null}},"created_at":"2024-05-28T21:06:04.000Z","updated_at":"2026-04-23T00:00:26.000Z","dependencies_parsed_at":"2024-11-18T07:28:13.069Z","dependency_job_id":"f58e63d4-8b8c-4e91-94e9-9b8602359ce5","html_url":"https://github.com/LMCache/LMCache","commit_stats":null,"previous_names":["lmcache/lmcache"],"tags_count":38,"template":false,"template_full_name":null,"purl":"pkg:github/LMCache/LMCache","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LMCache%2FLMCache","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LMCache%2FLMCache/tags","releases_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LMCache%2FLMCache/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LMCache%2FLMCache/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LMCache","download_url":"https://codeload.github.com/LMCache/LMCache/tar.gz/refs/heads/dev","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LMCache%2FLMCache/sbom","scorecard":{"id":145100,"data":{"date":"2025-08-16T08:42:00Z","repo":{"name":"github.com/LMCache/LMCache","commit":"fdbfedb35c9ae247c106834805cd42b6c4342c4e"},"scorecard":{"version":"v5.2.1","commit":"ab2f6e92482462fe66246d9e32f642855a691dc1"},"score":7.2,"checks":[{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: SECURITY.md:1","Info: Found linked content: SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: SECURITY.md:1","Info: Found text in security policy: SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#security-policy"}},{"name":"Maintained","score":10,"reason":"30 commit(s) and 4 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#dangerous-workflow"}},{"name":"Dependency-Update-Tool","score":10,"reason":"update tool 
detected","details":["Info: detected update tool: Dependabot: .github/dependabot.yml:1"],"documentation":{"short":"Determines if the project uses a dependency update tool.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#dependency-update-tool"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#binary-artifacts"}},{"name":"Code-Review","score":9,"reason":"Found 25/27 approved changesets -- score normalized to 9","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Info: jobLevel 'packages' permission set to 'read': .github/workflows/codeql.yml:36","Info: jobLevel 'actions' permission set to 'read': .github/workflows/codeql.yml:39","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:40","Info: jobLevel 'contents' permission set to 'read': .github/workflows/publish.yml:102","Warn: jobLevel 'contents' permission set to 'write': .github/workflows/publish.yml:144","Info: topLevel 'contents' permission set to 'read': .github/workflows/actionlint.yml:28","Warn: no topLevel permission defined: .github/workflows/build_doc.yml:1","Info: topLevel 'contents' permission set to 'read': .github/workflows/code_quality_checks.yml:9","Warn: no topLevel permission defined: .github/workflows/codeql.yml:1","Info: topLevel 'contents' permission set to 'read': .github/workflows/nightly_build.yml:8","Info: topLevel 
'contents' permission set to 'read': .github/workflows/publish.yml:29","Info: topLevel permissions set to 'read-all': .github/workflows/scorecard.yml:18","Info: topLevel 'contents' permission set to 'read': .github/workflows/stale_bot.yml:15"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":5,"reason":"badge detected: Passing","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":6,"reason":"dependency not pinned by hash detected -- score normalized to 6","details":["Info: Possibly incomplete results: error parsing shell code: \u003e must be followed by a word: docker/example_run.sh:0","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/scorecard.yml:95: update your workflow using https://app.stepsecurity.io/secureworkflow/LMCache/LMCache/scorecard.yml/dev?enable=pin","Warn: containerImage not pinned by hash: docker/Dockerfile:15","Warn: containerImage not pinned by hash: docker/Dockerfile:93","Warn: containerImage not pinned by hash: docker/Dockerfile:133","Warn: downloadThenRun not pinned by hash: docker/Dockerfile:23-35","Warn: pipCommand not pinned by hash: .buildkite/scripts/multi-round-qa.sh:6","Warn: pipCommand not pinned by hash: .buildkite/scripts/multi-round-qa.sh:14","Warn: pipCommand not pinned by hash: .github/workflows/build_doc.yml:36","Warn: pipCommand not pinned by hash: .github/workflows/build_doc.yml:37","Warn: pipCommand not pinned by hash: .github/workflows/publish.yml:69","Warn: pipCommand not pinned by hash: .github/workflows/publish.yml:70","Info:  22 out of  23 GitHub-owned 
GitHubAction dependencies pinned","Info:  20 out of  20 third-party GitHubAction dependencies pinned","Info:   0 out of   1 downloadThenRun dependencies pinned","Info:   0 out of   6 pipCommand dependencies pinned","Info:   1 out of   4 containerImage dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#pinned-dependencies"}},{"name":"Vulnerabilities","score":8,"reason":"2 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: PYSEC-2017-74"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":10,"reason":"SAST tool is run on all commits","details":["Info: SAST configuration detected: CodeQL","Info: all commits (30) are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#sast"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/nightly_build.yml:11"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#packaging"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v0.3.3 not signed: https://api.github.com/repos/LMCache/LMCache/releases/237196790","Warn: release artifact 
v0.3.2 not signed: https://api.github.com/repos/LMCache/LMCache/releases/232403772","Warn: release artifact v0.3.1.post1 not signed: https://api.github.com/repos/LMCache/LMCache/releases/228105943","Warn: release artifact v0.3.1 not signed: https://api.github.com/repos/LMCache/LMCache/releases/227797292","Warn: release artifact v0.3.0 not signed: https://api.github.com/repos/LMCache/LMCache/releases/221667451","Warn: release artifact v0.3.3 does not have provenance: https://api.github.com/repos/LMCache/LMCache/releases/237196790","Warn: release artifact v0.3.2 does not have provenance: https://api.github.com/repos/LMCache/LMCache/releases/232403772","Warn: release artifact v0.3.1.post1 does not have provenance: https://api.github.com/repos/LMCache/LMCache/releases/228105943","Warn: release artifact v0.3.1 does not have provenance: https://api.github.com/repos/LMCache/LMCache/releases/227797292","Warn: release artifact v0.3.0 does not have provenance: https://api.github.com/repos/LMCache/LMCache/releases/221667451"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":5,"reason":"branch protection is not maximal on development and all release branches","details":["Info: 'allow deletion' disabled on branch 'dev'","Info: 'force pushes' disabled on branch 'dev'","Warn: 'branch protection settings apply to administrators' is disabled on branch 'dev'","Warn: 'stale review dismissal' is disabled on branch 'dev'","Info: required approving review count is 2 on branch 'dev'","Warn: codeowners review is required - but no codeowners file found in repo","Warn: 'last push approval' is disabled on branch 'dev'","Info: status check found to merge onto on branch 'dev'","Info: PRs are required in order to make changes on branch 'dev'"],"documentation":{"short":"Determines if the default 
and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#branch-protection"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#fuzzing"}},{"name":"CI-Tests","score":10,"reason":"30 out of 30 merged PRs checked by a CI test -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project runs tests before pull requests are merged.","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#ci-tests"}},{"name":"Contributors","score":10,"reason":"project has 14 contributing companies or organizations","details":["Info: found contributions from: IBM, RPIGroup, apache, buildsome, ibm, instructlab, kernelim, kernelim ltd, rust-lang, tencent.com, tensormesh, uchicago, university of chicago, zhejiang university"],"documentation":{"short":"Determines if the project has a set of contributors from multiple organizations (e.g., 
companies).","url":"https://github.com/ossf/scorecard/blob/ab2f6e92482462fe66246d9e32f642855a691dc1/docs/checks.md#contributors"}}]},"last_synced_at":"2025-08-16T09:14:07.076Z","repository_id":260139084,"created_at":"2025-08-16T09:14:07.076Z","updated_at":"2025-08-16T09:14:07.076Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32259472,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T09:15:33.318Z","status":"ssl_error","status_checked_at":"2026-04-25T09:15:31.997Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amd","cuda","fast","inference","kv-cache","llm","pytorch","rocm","speed","vllm"],"created_at":"2026-01-05T10:09:51.667Z","updated_at":"2026-06-13T03:11:17.470Z","avatar_url":"https://github.com/LMCache.png","language":"Python","funding_links":[],"categories":["Deployment and Serving"],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cp align=\"center\"\u003e\n    \u003cimg src=\"asset/logo.png\" alt=\"lmcache logo\" width=\"45%\"\u003e\n  \u003c/p\u003e\n  \u003ch3 align=\"center\"\u003e\n    A KV Cache Management Layer for Scalable LLM Inference\n  \u003c/h3\u003e\n    \u003chr width=\"78%\"\u003e\n\n  \u003ch3 align=\"center\"\u003e\n    \u003ca href=\"https://blog.lmcache.ai/\"\u003eBlog\u003c/a\u003e |\n    \u003ca href=\"https://docs.lmcache.ai/\"\u003eDocumentation\u003c/a\u003e |\n    \u003ca 
href=\"https://join.slack.com/t/lmcacheworkspace/shared_invite/zt-3zxjao8h0-lRfBfnLqbALOtLsWn2ITxA\"\u003eJoin Slack\u003c/a\u003e |\n    \u003ca href=\"https://docs.lmcache.ai/community/meetings.html\"\u003eCommunity Meeting\u003c/a\u003e |\n    \u003ca href=\"https://github.com/LMCache/LMCache/issues/2923\"\u003eRoadmap\u003c/a\u003e\n  \u003c/h3\u003e\n\n  [![PyPI](https://img.shields.io/pypi/v/lmcache)](https://pypi.org/project/lmcache/)\n  [![PyPI - Downloads](https://img.shields.io/pypi/dm/lmcache)](https://pypi.org/project/lmcache/)\n  [![GitHub commit activity](https://img.shields.io/github/commit-activity/w/LMCache/LMCache)](https://github.com/LMCache/LMCache/graphs/commit-activity)\n  [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/LMCache/LMCache/)\n\n\u003c/div\u003e\n\n## Updates\n- [2026/05] 🔥 Agentic workload benchmark on AMD MI300X ([blog](https://blog.lmcache.ai/en/2026/05/12/benchmarking-lmcache-for-multi-turn-agentic-workloads-on-amd-mi300x/)).\n- [2026/04] 🔥 LMCache's new multiprocess (MP) architecture release ([blog](https://blog.lmcache.ai/en/2026/04/03/lmcaches-new-architecture-boosts-moe-inference-performance-by-10x/)).\n- [2026/03] LMCache at GTC 2026 ([post](https://www.linkedin.com/posts/lmcache-lab_llm-opensource-nvidiagtc-activity-7442721875664826369-pMAu?utm_source=share\u0026utm_medium=member_desktop\u0026rcm=ACoAADkIIvQBTyG53kXXX70OZdE5rhpllYQqmIA)).\n- [2026/01] LMCache multi-node P2P CPU memory sharing, from experimental feature to production ([blog](https://blog.lmcache.ai/en/2026/01/21/p2p-1/)).\n\n\u003cdetails\u003e\n\u003csummary\u003eMore\u003c/summary\u003e\n\n- [2025/11] LMCache x CoreWeave accelerate efficient LLM inference for Cohere ([blog](https://blog.lmcache.ai/en/2025/10/29/breaking-the-memory-barrier-how-lmcache-and-coreweave-power-efficient-llm-inference-for-cohere/)).\n- [2025/10] LMCache joins the PyTorch Foundation and Tensormesh unveiled 
([blog](https://blog.lmcache.ai/en/2025/10/31/tensormesh-unveiled-and-lmcache-joins-the-pytorch-foundation/), [PyTorch](https://pytorch.org/blog/lmcache-joins-pytorch-ecosystem/)).\n- [2025/09] NVIDIA Dynamo integrates LMCache, accelerating LLM inference ([blog](https://blog.lmcache.ai/en/2025/09/18/nvidia-dynamo-integrates-lmcache-accelerating-llm-inference/)).\n- [2025/08] 🎉 LMCache hits 5,000+ GitHub stars ([blog](https://blog.lmcache.ai/en/2025/08/28/%f0%9f%8e%89-lmcache-hits-5000-github-stars-thank-you-community/)).\n- [2025/08] LMCache supports gpt-oss (20B/120B) on day 1 ([blog](https://blog.lmcache.ai/en/2025/08/05/lmcache-supports-gpt-oss-20b-120b-on-day-1/)).\n- [2025/07] Get faster LLM inference and cheaper responses with LMCache and Redis ([Redis blog](https://redis.io/blog/get-faster-llm-inference-and-cheaper-responses-with-lmcache-and-redis/)).\n- [2025/07] LMCache extends its turbo-boost to multimodal models in vLLM V1 ([blog](https://blog.lmcache.ai/en/2025/07/03/lmcache-extends-its-turbo-boost-to-multimodal-models-in-vllm-v1/)).\n- [2025/06] LLM Production Stack goes cross-hardware: AMD, Arm and Ascend ([blog](https://blog.lmcache.ai/en/2025/06/20/llm-production-stack-goes-cross-hardware-ascend-arm-and-amd-support-incoming/)).\n\n\u003c/details\u003e\n\n## About\n\nLMCache is a **KV cache management layer** for LLM inference. It turns KV cache from a temporary state into reusable *AI-native knowledge* that can be *stored* persistently, *reused* across multiple serving engines, *monitored* with an observability stack, and *transformed* for better generation quality. As a result, LMCache **reduces TTFT** (time-to-first-token) and **improves throughput**, especially for long-context agentic, multi-turn conversation, and knowledge-augmented workloads (e.g., RAG).\n\nLMCache is **vendor-neutral**. 
It can be used as a KV cache layer for a range of mainstream open-source serving engines, inference frameworks, hardware vendors, storage systems, and infrastructure providers. The vendor neutrality allows users to freely switch between serving engines and storage vendors, while reusing the stored KV caches.\n\n\u003cp align=\"center\"\u003e\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"asset/deployment_modes_dark.png\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"asset/deployment_modes_light.png\"\u003e\n  \u003cimg alt=\"LMCache Deployment Modes\" src=\"asset/deployment_modes_light.png\"\u003e\n\u003c/picture\u003e\n\u003c/p\u003e\n\n### Key features\n\n- **Engine-independent deployment**: LMCache, as a standalone daemon process, manages KV cache independently from the inference engine process, so that KV cache will not be lost even if the inference engine crashes (i.e., no fate-sharing with engines).\n\n- **Persistent, tiered KV cache offloading and reuse**: Move KV caches out of GPU memory into a tiered storage hierarchy spanning CPU memory, local storage, and remote backends, enabling reuse across requests, sessions, and engine instances to reduce repeated prefill computation and improve TTFT.\n\n- **Production-level KV cache observability**: LMCache provides a rich set of KV cache observability metrics, including typical Kubernetes metrics (health monitoring, performance diagnostics), KV-cache-specific metrics (request-level and token-level prefix cache hits, lifecycle, request-level KV cache performance), management metrics (user-specific usage), and more.\n\n- **Pluggable storage and transport backends**: Easily integrate remote storage and KV transfer backends through a unified interface, enabling KV cache offloading and sharing across storage providers. 
Through this interface, LMCache supports storage backends including CPU RAM, local disk (SSD), Redis/Valkey, Mooncake, InfiniStore, S3-compatible object storage, NIXL, and GDS.\n\n- **Non-prefix KV reuse**: Extend KV reuse beyond prefix caching by reusing cached KV blocks at any position in the prompt. This leverages CacheBlend to selectively recompute tokens for quality recovery.\n\n- **PD disaggregation and KV transfer**: Support KV cache transfer from prefill workers to decode workers over NVLink, RDMA, or TCP through transport layers such as NIXL.\n\n- **Pluggable KV transformation**: A simple interface for researchers to write compression, token dropping, and custom serialization through a flexible SERDE interface.\n\nLMCache is becoming an integral layer in the LLM inference *ecosystem*, with *community*-driven integration with serving engines, inference frameworks, hardware vendors, storage systems, and infrastructure providers:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"asset/ecosystem.png\" alt=\"LMCache ecosystem\"\u003e\n\u003c/p\u003e\n\n## Getting Started\n\nTo use LMCache, simply install `lmcache` from your package manager, e.g. pip:\n```bash\npip install lmcache\n```\n\nFor more setup options and examples, see:\n- [Installation](https://docs.lmcache.ai/getting_started/installation.html)\n- [Quickstart](https://docs.lmcache.ai/getting_started/quickstart.html)\n- [LMCache Recipes](https://docs.lmcache.ai/recipes/index.html)\n- [CLI Reference](https://docs.lmcache.ai/cli/index.html)\n- [Benchmarking Guide](https://docs.lmcache.ai/getting_started/benchmarking.html)\n- [Production Deployment](https://docs.lmcache.ai/mp/deployment.html)\n\n## Contributing\nWe welcome and value contributions and collaborations. Join us in improving LMCache. 
Check out the [Contributing Guide](https://docs.lmcache.ai/developer_guide/contributing.html) or join our [Slack community](https://join.slack.com/t/lmcacheworkspace/shared_invite/zt-3zxjao8h0-lRfBfnLqbALOtLsWn2ITxA) to get started.\n\n## Adoption and Partnerships\nLMCache has a growing community of developers, researchers, industry adopters, and partners building the next generation of efficient LLM inference systems.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"asset/partner_dark.png\"\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"asset/partner_light.png\"\u003e\n    \u003cimg alt=\"LMCache Adoption and Partnerships\" src=\"asset/partner_light.png\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nAs an independent open-source project, LMCache is becoming the de facto standard for KV cache management in LLM inference. Its continued development and community work are supported in part by [Tensormesh](https://www.tensormesh.ai/).\n\n## Citation\n\nLMCache builds on research in KV cache management, including cache reuse, offloading, compression, and serving optimization. 
If you use LMCache in your research, please cite the LMCache paper and related work.\n\n~~~bibtex\n@article{cheng2025lmcache,\n  title={LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference},\n  author={Cheng, Yihua and Liu, Yuhan and Yao, Jiayi and An, Yuwei and Chen, Xiaokun and Feng, Shaoting and Huang, Yuyang and Shen, Samuel and Du, Kuntai and Jiang, Junchen},\n  journal={arXiv preprint arXiv:2510.09665},\n  year={2025}\n}\n~~~\n\n\u003cdetails\u003e\n\u003csummary\u003eRelated papers\u003c/summary\u003e\n\n~~~bibtex\n@inproceedings{liu2024cachegen,\n  title={Cachegen: Kv cache compression and streaming for fast large language model serving},\n  author={Liu, Yuhan and Li, Hanchen and Cheng, Yihua and Ray, Siddhant and Huang, Yuyang and Zhang, Qizheng and Du, Kuntai and Yao, Jiayi and Lu, Shan and Ananthanarayanan, Ganesh and others},\n  booktitle={Proceedings of the ACM SIGCOMM 2024 Conference},\n  pages={38--56},\n  year={2024}\n}\n\n@inproceedings{yao2025cacheblend,\n  title={Cacheblend: Fast large language model serving for rag with cached knowledge fusion},\n  author={Yao, Jiayi and Li, Hanchen and Liu, Yuhan and Ray, Siddhant and Cheng, Yihua and Zhang, Qizheng and Du, Kuntai and Lu, Shan and Jiang, Junchen},\n  booktitle={Proceedings of the twentieth European conference on computer systems},\n  pages={94--109},\n  year={2025}\n}\n~~~\n\n\u003c/details\u003e\n\n## License\n\nThe LMCache codebase is licensed under Apache License 2.0. See the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flmcache%2Flmcache","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flmcache%2Flmcache","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flmcache%2Flmcache/lists"}