
https://github.com/codepawl/turboquant-torch

Unofficial PyTorch implementation of TurboQuant (Google Research, ICLR 2026). Near-optimal vector quantization for KV cache compression and vector search. The project claims 3-bit quantization with zero accuracy loss.

Topics: compression, inference, kv-cache, llm, pytorch, quantization


