https://github.com/pogzyb/llmabda
Run llama.cpp server on aws lambda for cheap
https://github.com/pogzyb/llmabda
agentic-ai inference inference-api inference-engine lambda-functions llamacpp
Last synced: 4 months ago
JSON representation
Run llama.cpp server on aws lambda for cheap
- Host: GitHub
- URL: https://github.com/pogzyb/llmabda
- Owner: pogzyb
- License: mit
- Created: 2025-10-23T00:12:34.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-12-03T14:09:43.000Z (6 months ago)
- Last Synced: 2025-12-05T22:55:12.428Z (6 months ago)
- Topics: agentic-ai, inference, inference-api, inference-engine, lambda-functions, llamacpp
- Language: HCL
- Homepage:
- Size: 174 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
## LLMABDA
[](https://github.com/pogzyb/llmabda/actions/workflows/release.yml)
🐑 Run Quantized Agents on AWS Lambda for Cheap 🐥
### What is this?
LLMabda is a simple proxy for [llama.cpp](https://github.com/ggml-org/llama.cpp) server that runs on AWS Lambda.
### Deployment
Information about deployment is in [deploy/](/deploy/DEPLOY.md)
### Demo
1. Agent (Using ggml-org/Qwen3-1.7B-GGUF)
- [source code](https://github.com/pogzyb/llmabda/blob/main/examples/react_agent_simple.py)
- "What is 5 + (5 * 6)? Please shout the final answer."

2. Summarize text (Using ggml-org/Qwen3-1.7B-GGUF)
- [source code](https://github.com/pogzyb/llmabda/blob/main/examples/task_summarize_osrs.py)
- "What happened to the Crystal Extractor?"
