https://github.com/sleepingcat4/inference-decoding
Continuously updated with my favorite fast LLM inference techniques; all are tested on the Leonardo supercomputer.
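The topics point at Hugging Face Transformers for LLM inference on HPC clusters. As a rough illustration of the kind of technique the description refers to, below is a minimal sketch of assisted (speculative) decoding with the Transformers `generate` API. It is not taken from the repository, and the model names are placeholder assumptions; any draft/target pair that shares a tokenizer should work.

```python
# Minimal sketch of assisted / speculative decoding with Hugging Face Transformers.
# Not from the repository; "gpt2-large" and "distilgpt2" are placeholder models
# chosen only because they share a tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
target = AutoModelForCausalLM.from_pretrained("gpt2-large").to(device)  # large target model
draft = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)   # small draft model

inputs = tokenizer("Speculative decoding speeds up generation by", return_tensors="pt").to(device)

# The draft model proposes several tokens per step; the target model verifies them
# in a single forward pass, which typically lowers wall-clock latency while leaving
# greedy-decoded output unchanged.
output = target.generate(
    **inputs,
    assistant_model=draft,
    max_new_tokens=50,
    do_sample=False,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```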
- Host: GitHub
- URL: https://github.com/sleepingcat4/inference-decoding
- Owner: sleepingcat4
- License: MIT
- Created: 2025-01-02T15:04:26.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2025-01-07T03:28:44.000Z (6 months ago)
- Last Synced: 2025-01-23T05:20:13.501Z (5 months ago)
- Topics: hpc-clusters, huggingface-transformers, llm-inference
- Language: Python
- Homepage:
- Size: 26.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- License: LICENSE