https://github.com/pchsu-hsupc/edge_ai_13th
This project optimizes the Llama-3.2-3B-Instruct model for fast inference on a single NVIDIA T4 GPU (16 GB), targeting high throughput and low perplexity for efficient edge deployment.
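A rough back-of-the-envelope sketch of why GGUF quantization matters for this setup: at a half-precision width, a ~3B-parameter model's weights alone approach 6 GiB, while a 4-bit GGUF quant fits comfortably in a 16 GB T4 alongside the KV cache. The parameter count and effective bit widths below are approximations for illustration, not figures from the repository.

```python
# Approximate weight-memory footprint of a ~3B-parameter model at
# different quantization widths. Real GGUF files add metadata and
# per-block scale factors, so actual file sizes differ somewhat.

PARAMS = 3e9  # assumed parameter count for Llama-3.2-3B-Instruct

def weight_gib(bits_per_weight: float) -> float:
    """Weight storage in GiB at a given (effective) bits-per-weight."""
    return PARAMS * bits_per_weight / 8 / 2**30

fp16 = weight_gib(16)   # unquantized half precision
q4 = weight_gib(4.5)    # typical effective width of a 4-bit GGUF quant

print(f"fp16 weights: ~{fp16:.1f} GiB")
print(f"4-bit GGUF weights: ~{q4:.1f} GiB")
```

The roughly 3.5x reduction in weight memory is what leaves headroom on a 16 GB card for the context cache and inference overhead.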
- Host: GitHub
- URL: https://github.com/pchsu-hsupc/edge_ai_13th
- Owner: pchsu-hsupc
- Created: 2025-05-06T07:49:03.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2025-06-04T15:25:47.000Z (5 months ago)
- Last Synced: 2025-06-04T17:50:50.187Z (5 months ago)
- Topics: gguf, llama-cpp-python, llama3, lora
- Language: Python
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0