https://github.com/sleepingcat4/inference-decoding
Continuously updated with my favorite fast LLM inference techniques; all are tested on the Leonardo supercomputer.
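The topics point at Hugging Face Transformers for LLM inference on HPC clusters. As a rough illustration of the kind of technique the description refers to, below is a minimal sketch of assisted (speculative) decoding with the Transformers `generate` API. It is not taken from the repository, and the model names are placeholder assumptions; any draft/target pair that shares a tokenizer should work.

```python
# Minimal sketch of assisted / speculative decoding with Hugging Face Transformers.
# Not from the repository; "gpt2-large" and "distilgpt2" are placeholder models
# chosen only because they share a tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
target = AutoModelForCausalLM.from_pretrained("gpt2-large").to(device)  # large target model
draft = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)   # small draft model

inputs = tokenizer("Speculative decoding speeds up generation by", return_tensors="pt").to(device)

# The draft model proposes several tokens per step; the target model verifies them
# in a single forward pass, which typically lowers wall-clock latency while leaving
# greedy-decoded output unchanged.
output = target.generate(
    **inputs,
    assistant_model=draft,
    max_new_tokens=50,
    do_sample=False,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```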
- Host: GitHub
- URL: https://github.com/sleepingcat4/inference-decoding
- Owner: sleepingcat4
- License: MIT
- Created: 2025-01-02T15:04:26.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2025-01-07T03:28:44.000Z (6 months ago)
- Last Synced: 2025-01-23T05:20:13.501Z (5 months ago)
- Topics: hpc-clusters, huggingface-transformers, llm-inference
- Language: Python
- Homepage:
- Size: 26.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- License: LICENSE