https://github.com/Bestpay-inc/Falcon
Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree
Last synced: 15 days ago
- Host: GitHub
- URL: https://github.com/Bestpay-inc/Falcon
- Owner: Bestpay-inc
- License: apache-2.0
- Created: 2025-04-21T07:29:15.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-21T07:40:26.000Z (about 1 year ago)
- Last Synced: 2025-04-21T08:37:01.534Z (about 1 year ago)
- Language: Python
- Size: 1.53 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Awesome Lists containing this project
- awesomeopd - Falcon | 2024.12 | Bestpay | [arXiv 2412.12639](https://arxiv.org/abs/2412.12639) | Falcon | (⚡ Speculative-Decoding Distillation / 🔁 Iterative Self-Bootstrapping)