Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nptt9/illama
A fast, lightweight, parallel inference server for Llama LLMs.
Last synced: 11 days ago
- Host: GitHub
- URL: https://github.com/nptt9/illama
- Owner: nptt9
- License: MIT
- Created: 2024-05-23T05:37:08.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-07-30T14:33:37.000Z (3 months ago)
- Last Synced: 2024-10-09T05:33:51.789Z (29 days ago)
- Topics: exllama, exllamav2, flash-attention-2, inference, llama, llama2, llama3, llm-inference, paged-attention, server
- Language: Python
- Homepage:
- Size: 45.9 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE