https://github.com/pchsu-hsupc/edge_ai_13th

This project optimizes the LLaMA-3.2B-Instruct model for fast inference on a single NVIDIA T4 GPU (16 GB), targeting high throughput and low perplexity for efficient edge deployment.

Topics: gguf, llama-cpp-python, llama3, lora
