Projects in Awesome Lists tagged with latency-optimization
A curated list of projects in awesome lists tagged with latency-optimization.
https://github.com/avilum/minrlm
Stop forcing LLMs to answer in one pass. Give them a runtime. A Recursive Language Model that improves any LLM while reducing token usage by up to 4x.
agent ai-agents cost-optimization latency-optimization llm llm-inference llmops recursive-language-model rlm token-optimization
Last synced: 07 Apr 2026
https://github.com/umitkacar/onnx-tensorrt-optimization
40x faster AI inference: ONNX to TensorRT optimization with FP16/INT8 quantization, multi-GPU support, and deployment
cuda deep-learning edge-computing fp16 gpu-acceleration inference-acceleration int8 latency-optimization mlops model-deployment model-optimization nvidia-gpu onnx onnxruntime production-ai pytorch-to-onnx quantization real-time-inference tensorflow-to-onnx tensorrt
Last synced: 18 Feb 2026
https://github.com/cepdnaclk/e19-4yp-ai-dirven-latency-constrained-resource-management-in-kubernetes
This repo focuses on latency-aware resource optimization for Kubernetes
cloud-environments cold-start-mitigation distributed-tracing kubernetes latency-optimization load-balancing metrics-collection microservices online-learning penalty-based-optimization performance-testing prediction-models realtime-monitoring reinforcement-learning resource-allocation scaling-methods service-dependency system-performance traffic-simulation workload-profiling
Last synced: 17 Jun 2025
https://github.com/north-shore-ai/crucible_hedging
Request hedging for tail latency reduction in distributed systems
ai beam distributed-systems elixir ensemble-methods hedging latency-optimization latency-reduction llm machine-learning otp performance-optimization reliability request-hedging research statistical-testing tail-latency
Last synced: 21 Oct 2025