{"id":19631040,"url":"https://github.com/fasterdecoding/snapkv","last_synced_at":"2025-08-03T19:07:34.401Z","repository":{"id":235261977,"uuid":"776297458","full_name":"FasterDecoding/SnapKV","owner":"FasterDecoding","description":null,"archived":false,"fork":false,"pushed_at":"2024-05-01T00:32:13.000Z","size":871,"stargazers_count":261,"open_issues_count":17,"forks_count":19,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-07-09T13:56:10.159Z","etag":null,"topics":["long-context-modeling"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FasterDecoding.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-23T05:36:18.000Z","updated_at":"2025-07-09T07:24:49.000Z","dependencies_parsed_at":"2024-12-26T12:07:24.049Z","dependency_job_id":"51d1237e-47a9-415f-9ec9-6da8b8db9fb3","html_url":"https://github.com/FasterDecoding/SnapKV","commit_stats":null,"previous_names":["fasterdecoding/snapkv"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/FasterDecoding/SnapKV","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FasterDecoding%2FSnapKV","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FasterDecoding%2FSnapKV/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FasterDecoding%2FSnapKV/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FasterDecoding%2FSnapKV/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/o
wners/FasterDecoding","download_url":"https://codeload.github.com/FasterDecoding/SnapKV/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FasterDecoding%2FSnapKV/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268596566,"owners_count":24275909,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-03T02:00:12.545Z","response_time":2577,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["long-context-modeling"],"created_at":"2024-11-11T12:07:38.436Z","updated_at":"2025-08-03T19:07:34.348Z","avatar_url":"https://github.com/FasterDecoding.png","language":"Python","readme":"# SnapKV :camera:\nWe introduce an innovative, out-of-the-box KV cache compression method, [SnapKV](https://arxiv.org/abs/2404.14469).\n## Requirements\nCurrently tested with `transformers==4.37.0`; compatibility with higher versions has not yet been verified.\n```\ntransformers\u003e=4.36\nflash-attn==2.4.0\n```\n## Installation\n```\ngit clone git@github.com:FasterDecoding/SnapKV.git\ncd SnapKV\npip install -e .\n```\n## Quick Start\n### Use SnapKV-optimized Models\nFor example: \n```python\nfrom snapkv.monkeypatch.monkeypatch import replace_mistral\nreplace_mistral() # Apply the monkey patch to enable SnapKV\n```\n\nSee [the example notebook](./notebooks/example.ipynb).\n\n### Customize Your SnapKV-optimized Models\nSnapKV can be easily integrated with other 
models. \n\nYou can follow the comments marked with `[SnapKV]` in [existing models](./snapkv/monkeypatch/monkeypatch.py) to construct your own models. (Currently we support [Llama family](./snapkv/monkeypatch/llama_hijack_4_37.py)/ [Mistral](./snapkv/monkeypatch//mistral_hijack_4_37.py)/ [Mixtral](./snapkv/monkeypatch//mixtral_hijack_4_37.py).) \n\nThe detailed SnapKV algorithm is in [`snapkv_utils.py`](./snapkv/monkeypatch/snapkv_utils.py).\n\n\n## Partial Results\n![Comprehensive Experiment Results on LongBench](./assets/longbench.jpg)\n![Pressure Test Result on Needle-in-a-Haystack](./assets/LWM-Text-Chat-1M_SnapKV.jpg)\n\n## TODO\n- [ ] Add observation experiments for reduplication.\n- [ ] Add LongBench for reduplication.\n- [ ] Explore prompt-phase compression.\n\n## Citation\nIf you find this project helpful, please consider citing our report :blush:\n```\n@article{li2024snapkv,\n  title={SnapKV: LLM Knows What You are Looking for Before Generation},\n  author={Li, Yuhong and Huang, Yingbing and Yang, Bowen and Venkitesh, Bharat and Locatelli, Acyr and Ye, Hanchen and Cai, Tianle and Lewis, Patrick and Chen, Deming},\n  journal={arXiv preprint arXiv:2404.14469},\n  year={2024}\n}\n```","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasterdecoding%2Fsnapkv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffasterdecoding%2Fsnapkv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffasterdecoding%2Fsnapkv/lists"}