{"id":20298145,"url":"https://github.com/hemingkx/Spec-Bench","last_synced_at":"2025-05-07T20:34:29.844Z","repository":{"id":223599214,"uuid":"760971019","full_name":"hemingkx/Spec-Bench","owner":"hemingkx","description":"Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)","archived":false,"fork":false,"pushed_at":"2025-04-19T04:32:38.000Z","size":4077,"stargazers_count":252,"open_issues_count":3,"forks_count":31,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-19T11:51:01.465Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://sites.google.com/view/spec-bench","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hemingkx.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-21T01:56:26.000Z","updated_at":"2025-04-19T04:32:41.000Z","dependencies_parsed_at":"2024-02-24T04:24:26.743Z","dependency_job_id":"14cc1364-cdbf-4a1d-bbc2-809353579475","html_url":"https://github.com/hemingkx/Spec-Bench","commit_stats":null,"previous_names":["hemingkx/spec-bench"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hemingkx%2FSpec-Bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hemingkx%2FSpec-Bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hemingkx%2FSpec-Bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hemingkx%2FSp
ec-Bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hemingkx","download_url":"https://codeload.github.com/hemingkx/Spec-Bench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252953717,"owners_count":21830890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-14T16:02:16.293Z","updated_at":"2025-05-07T20:34:29.817Z","avatar_url":"https://github.com/hemingkx.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话"],"sub_categories":["大语言对话模型及数据"],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch2\u003e\u003cimg src=\"assets/logo.png\" height=\"28px\"/\u003e\u003ci\u003eSpec-Bench:\u003c/i\u003e A Comprehensive Benchmark and Unified\u003cbr\u003eEvaluation Platform for Speculative Decoding\u003c/h2\u003e \n\u003c/div\u003e\n\u003cp align=\"center\"\u003e\n| \u003ca href=\"https://arxiv.org/abs/2401.07851\"\u003e\u003cb\u003ePaper\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://sites.google.com/view/spec-bench/\"\u003e\u003cb\u003eBlog\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://github.com/hemingkx/Spec-Bench/blob/main/Leaderboard.md\"\u003e\u003cb\u003eLeaderboard\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"ROADMAP.md\"\u003e\u003cb\u003eRoadmap\u003c/b\u003e\u003c/a\u003e |\n\u003c/p\u003e\n\n\n\n\n\n![timeline](./assets/7B.png)\n\n\u003cdiv align=\"center\"\u003e\n\u003cfont color=\"gray\"\u003eSpeedup comparison of Speculative Decoding methods on Spec-Bench, evaluated by 
Vicuna-7B-v1.3.\u003c/font\u003e\n\u003c/div\u003e\n\n## Introduction\n\nSpec-Bench is a comprehensive benchmark designed for assessing Speculative Decoding methods across diverse scenarios. Based on Spec-Bench, we aim to establish and maintain a unified evaluation platform for open-source Speculative Decoding approaches. This platform facilitates the systematic assessment of existing methods ***on the same device and in the same testing environment***, thereby ensuring fair comparisons.\n\nCurrently, Spec-Bench supports the evaluation of the following open-source methods:\n\n- [EAGLE-1,2,3](https://github.com/SafeAILab/EAGLE)\n- [Hydra](https://github.com/zankner/hydra)\n- [Medusa](https://sites.google.com/view/medusa-llm)\n- [Speculative Sampling](https://huggingface.co/blog/assisted-generation)\n- [Prompt Lookup Decoding](https://github.com/apoorvumang/prompt-lookup-decoding)\n- [TokenRecycling](https://github.com/Luowaterbi/TokenRecycling)\n- [REST](https://sites.google.com/view/rest-llm/)\n- [Lookahead Decoding](https://lmsys.org/blog/2023-11-21-lookahead-decoding/)\n- [SPACE](https://github.com/cteant/SPACE)\n- [SAM-Decoding](https://github.com/hyx1999/SAM-Decoding)\n\nAlongside the stable version, Spec-Bench now supports the latest transformers with multiple speculative decoding methods, including the SpS and EAGLE series. 
For details, check out the [latest-transformer](https://github.com/hemingkx/Spec-Bench/tree/latest-transformer) branch.\n\n## Update\n\n**2025.04.22**: We have updated the [Leaderboard](https://github.com/hemingkx/Spec-Bench/blob/main/Leaderboard.md) in Spec-Bench.\n\n**2025.03.23**: We have integrated [EAGLE-3](https://github.com/SafeAILab/EAGLE) into Spec-Bench.\n\n**2025.03.18**: We have integrated [SAM-Decoding](https://github.com/hyx1999/SAM-Decoding) into Spec-Bench.\n\n**2024.10.25**: We have integrated [EAGLE-2](https://github.com/SafeAILab/EAGLE) into Spec-Bench.\n\n**2024.05.29**: We have integrated [SPACE](https://github.com/cteant/SPACE) into Spec-Bench.\n\n**2024.05.16**: Our [paper](https://arxiv.org/abs/2401.07851) has been accepted by ACL 2024 Findings 🎉 !\n\n**2024.03.12**: We now support statistics for [#Mean accepted tokens](https://github.com/hemingkx/Spec-Bench/blob/main/evaluation/speed.py#L65).\n\n**2024.03.11**: We have integrated [Hydra](https://github.com/zankner/hydra) into Spec-Bench, check it out!\n\n## Installation\n\n```\nconda create -n specbench python=3.12\nconda activate specbench\ncd Spec-Bench\npip install -r requirements.txt\n```\n\n## Model Weights\n\nDownload the corresponding model weights (if required) and modify the checkpoint path in `eval.sh`.\n\n- [vicuna-v1.3](https://huggingface.co/lmsys/vicuna-7b-v1.3)\n- [EAGLE-1,3](https://github.com/SafeAILab/EAGLE?tab=readme-ov-file#eagle-weights)\n- [Hydra](https://github.com/zankner/hydra?tab=readme-ov-file#model-weights)\n- [Medusa-1](https://github.com/FasterDecoding/Medusa?tab=readme-ov-file#medusa-1)\n- [Speculative Sampling](https://github.com/NJUNLP/MCSD?tab=readme-ov-file#model-release)\n- [SPACE](https://huggingface.co/AntMan/vicuna-v1.3-7b-space)\n\n## Additional Setup\n\n#### REST (Optional)\n\n##### Build DraftRetriever from source\n\n```\ncd model/rest/DraftRetriever\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\nmaturin build --release --strip -i 
python3.12 # will produce a .whl file\npip3 install ./target/wheels/draftretriever-0.1.0-cp312-cp312-linux_x86_64.whl\n```\n\n##### Create a datastore\n\n```\ncd model/rest/datastore\n./datastore.sh # modify your own path\n```\n\n## Inference\n\nSelect the command line you want to run in `eval.sh`; the results will be stored in `data/spec_bench/model_answer/`.\n\n```\n./eval.sh\n```\n\n\u003e We also provide an automatic evaluation script in `./scripts` for your reference.\n\n## Speedup Report\n\nCompute the speedup relative to vanilla autoregressive decoding.\n\n```\npython evaluation/speed.py --file-path /your_own_path/eagle.jsonl --base-path /your_own_path/vicuna.jsonl\n```\n\n## Result Comparison\n\nCheck whether the generated results match those of autoregressive decoding.\n\n```\npython evaluation/equal.py --file-path /your_own_path/model_answer/ --jsonfile1 vicuna.jsonl --jsonfile2 eagle.jsonl\n```\n\n## Contributing\n\nWe warmly welcome contributions and discussions related to Spec-Bench! If you have any suggestions for improvements or ideas you'd like to discuss, please don't hesitate to open an issue. This will allow us to collaborate and discuss your ideas in detail.\n\n***More models are welcome!*** - If you're aware of any open-source Speculative Decoding methods not currently included in Spec-Bench, we encourage you to contribute by submitting a pull request. This helps ensure Spec-Bench remains a comprehensive and fair benchmarking platform for comparing existing methods. Please ensure that your changes are well-tested before submission.\n\n## Acknowledgments\n\nThis codebase is built from [Medusa](https://github.com/FasterDecoding/Medusa) and [EAGLE](https://github.com/SafeAILab/EAGLE). 
We integrated code implementations of multiple open-source Speculative Decoding methods to facilitate unified evaluation.\n\n## Citation\n\nIf you find the resources in this repository useful, please cite our paper:\n\n```\n@inproceedings{xia-etal-2024-unlocking,\n    title = \"Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding\",\n    author = \"Xia, Heming and Yang, Zhe and Dong, Qingxiu and Wang, Peiyi and Li, Yongqi  and Ge, Tao and Liu, Tianyu and Li, Wenjie and Sui, Zhifang\",\n    editor = \"Ku, Lun-Wei and Martins, Andre and Srikumar, Vivek\",\n    booktitle = \"Findings of the Association for Computational Linguistics ACL 2024\",\n    month = aug,\n    year = \"2024\",\n    address = \"Bangkok, Thailand and virtual meeting\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2024.findings-acl.456\",\n    doi = \"10.18653/v1/2024.findings-acl.456\",\n    pages = \"7655--7671\",\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhemingkx%2FSpec-Bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhemingkx%2FSpec-Bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhemingkx%2FSpec-Bench/lists"}