https://github.com/smpanaro/token-recycling
Unofficial implementation of the Token Recycling self-speculative decoding method.
large-language-model llm-inference speculative-decoding
- Host: GitHub
- URL: https://github.com/smpanaro/token-recycling
- Owner: smpanaro
- Created: 2024-11-21T05:16:34.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-21T05:35:25.000Z (over 1 year ago)
- Last Synced: 2024-11-21T06:24:01.039Z (over 1 year ago)
- Topics: large-language-model, llm-inference, speculative-decoding
- Language: Python
- Homepage: https://arxiv.org/abs/2408.08696
- Size: 611 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md