{"id":49680775,"url":"https://github.com/manishklach/kv-cpu-driver","last_synced_at":"2026-05-11T10:01:26.626Z","repository":{"id":355928210,"uuid":"1228784181","full_name":"manishklach/kv-cpu-driver","owner":"manishklach","description":"Reference Linux control plane, RTL, and FPGA emulation scaffold for KV-CPU semantic KV-cache orchestration. Patent pending in India (App No. 202641056309).","archived":false,"fork":false,"pushed_at":"2026-05-07T05:27:47.000Z","size":4381,"stargazers_count":2,"open_issues_count":4,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-07T06:47:26.234Z","etag":null,"topics":["device-driver","fpga","kv-cache","linux-kernel","llm-inference","memory-tiering","pcie","rtl","systemverilog","vllm"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/manishklach.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-04T11:26:22.000Z","updated_at":"2026-05-07T05:27:51.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/manishklach/kv-cpu-driver","commit_stats":null,"previous_names":["manishklach/kv-cpu-driver"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/manishklach/kv-cpu-driver","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manishklach%2Fkv-cpu-driver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manishklach%2Fkv-cpu-driver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manishklach%2Fkv-cpu-driver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manishklach%2Fkv-cpu-driver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/manishklach","download_url":"https://codeload.github.com/manishklach/kv-cpu-driver/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/manishklach%2Fkv-cpu-driver/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32770544,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T02:36:36.067Z","status":"ssl_error","status_checked_at":"2026-05-08T02:36:07.210Z","response_time":54,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["device-driver","fpga","kv-cache","linux-kernel","llm-inference","memory-tiering","pcie","rtl","systemverilog","vllm"],"created_at":"2026-05-07T06:44:54.980Z","updated_at":"2026-05-09T08:01:47.615Z","avatar_url":"https://github.com/manishklach.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KV-CPU Linux Control Plane (Reference Driver)\n\nLLM inference is becoming **memory-orchestration bound**, not compute bound. As context windows and batch sizes expand, the bottleneck shifts from raw FLOPS to the efficient movement and placement of the KV-cache across memory tiers. \n\nThis repository provides a reference Linux kernel driver that introduces a **semantic memory control plane** for a hypothetical KV-CPU device. It demonstrates how high-level transformer inference semantics can be exposed to hardware to enable intelligent, autonomous memory orchestration.\n\n---\n\n## The Problem\n\n1.  **KV Cache Growth:** Modern LLMs require massive KV caches that often exceed available GPU HBM, forcing offloading to slower memory tiers.\n2.  **Semantic Blindness:** Existing OS memory management (LRU, swapping) is \"semantic-blind\" to inference workloads. The kernel does not understand autoregressive **decode steps**, prefix sharing, or the predictable future access patterns of KV blocks, leading to inefficient eviction and high-latency page faults on the hot path.\n\n## The Key Idea: Semantic Signaling\n\nThe KV-CPU architecture proposes a shift where the inference runtime (e.g., vLLM, TensorRT-LLM) signals high-level intent to the hardware via the Linux kernel:\n\n-   **Decode Step Synchronization:** Informs the hardware of the current global iteration ($t$), allowing silicon-level logic to calculate block \"freshness.\"\n-   **Semantic Lifecycle Hints:** Explicitly identifies blocks as `HOT` (protected), `EVICTABLE` (candidate for offload), or `PREFETCHABLE` (needed in future steps).\n-   **Hardware-Level Orchestration:** Moves the eviction policy from reactive software to an autonomous hardware controller (HEPC) that operates directly on the memory data plane.\n\n## What This Repo Implements\n\nThis repository is a **control-plane prototype** and architectural reference. It implements:\n\n-   **Kernel Driver Skeleton:** A standard Linux `pci_driver` with character device registration (`/dev/kvcpu0`).\n-   **IOCTL Control Plane:** A standardized UAPI for signaling decode steps and block management.\n-   **MMIO Abstraction:** A clean register-access layer with a robust **Mock Mode** for testing without physical hardware.\n-   **Userspace Utility:** A reference tool (`kvctl`) to demonstrate interaction with the driver.\n-   **vLLM Adapter Sketch:** A lightweight Python wrapper under [`integrations/vllm/`](./integrations/vllm) that can issue `STEP_ADVANCE` on each decode step against `/dev/kvcpu0`.\n-   **Hardware Collateral:** Supporting RTL, MMIO, thermal, diagrams, and specification artifacts under [`hardware/`](./hardware).\n-   **FPGA Emulation Scaffold:** A Phase 1 emulation plan and starter integration structure under [`fpga/`](./fpga).\n\n## Implementation Status\n\nThe following table clarifies the scope of this reference implementation versus the conceptual hardware architecture.\n\n| Feature | Status |\n| :--- | :--- |\n| **Kernel Driver** | **Implemented** (Reference skeleton \u0026 lifecycle) |\n| **IOCTL Control Plane** | **Implemented** (UAPI and Dispatcher) |\n| **MMIO Interface** | **Stub / Mock** (Fake register writes) |\n| **DMA Engines** | **Not Implemented** (Signaling only) |\n| **HEPC Eviction Logic** | **Conceptual** (Silicon-level logic) |\n| **RTBD Directory** | **Conceptual** (Hardware tag storage) |\n| **madvise Integration** | **Not Implemented** |\n| **io_uring Integration** | **Not Implemented** |\n| **NUMA / HMAT** | **Not Implemented** |\n\n---\n\n## Architecture Diagram\n\n```text\n  [ User Space ]          [ Kernel Space ]          [ Hardware Space ]\n  (LLM Runtime)           (kv_cpu Driver)           (KV-CPU Silicon)\n       |                        |                         |\n       |--- ioctl(STEP) -------\u003e|                         |\n       |                        |--- MMIO Write ---------\u003e| [ HEPC Scorer ]\n       |                        |                         |       |\n       |--- ioctl(HOT/EVICT) --\u003e|                         | [ DMA Engine ]\n       |                        |--- MMIO Doorbell ------\u003e|       |\n       |                        |                         |       v\n       |                        |                         | [ Memory Tiers ]\n```\n\n## Example Flow: Semantic Lifecycle\n\n1.  **STEP:** Runtime signals `STEP 120` to the driver. Hardware HEPC re-evaluates all cached KV blocks based on their proximity to step 120.\n2.  **HOT:** Runtime identifies a specific prefix range as `HOT`. Hardware boosts its priority to prevent eviction.\n3.  **PREFETCH:** Runtime hints that a block will be needed at `STEP 256`. Hardware DMA autonomously moves the block from DRAM to LPDDR.\n4.  **EVICT:** Runtime marks a completed request's cache as `EVICTABLE`. Hardware immediately reclaims the space.\n\nIn this reference implementation, these lifecycle operations are modeled as MMIO control signals only. No DMA submission, page pinning, or physical data movement is performed by the driver.\n\n---\n\n## Build \u0026 Run\n\n### 1. Build the Driver and Tools\n```bash\nmake\n```\n\n### 2. Load the Module (Mock Mode)\n```bash\n# Load without physical hardware requirement\nsudo insmod kv_cpu.ko mock=1\n```\n\n### 3. Use the Reference Tool\n```bash\n# Signal a decode step\n./tools/kvctl step 128\n\n# Mark a range as HOT\n./tools/kvctl hot 0x7f001000 4096\n```\n\n### 4. Exercise the Python vLLM-Side Handshake\n```bash\n# Issue the same STEP_ADVANCE ioctl from Python\npython3 integrations/vllm/kv_cpu_allocator.py 128\n```\n\nThe adapter in [`integrations/vllm/`](./integrations/vllm) is intentionally small:\nit wraps a vLLM-style allocator, delegates real block management to vLLM, and\nadds an explicit `advance_decode_step()` hook that can be called once per decode\niteration in mock mode or against future hardware.\n\n## Disclaimer\n\n**This repository demonstrates a control-plane model for a KV-CPU device and is not a production-ready driver.** It is intended for systems architects and kernel engineers to evaluate the integration semantics of AI memory accelerators. No actual memory movement or performance simulation is performed by this driver.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanishklach%2Fkv-cpu-driver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmanishklach%2Fkv-cpu-driver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanishklach%2Fkv-cpu-driver/lists"}