https://github.com/redhat-et/triton-cache-performance-comparison
- Host: GitHub
- URL: https://github.com/redhat-et/triton-cache-performance-comparison
- Owner: redhat-et
- License: apache-2.0
- Created: 2025-03-06T09:03:11.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-06T09:04:37.000Z (about 1 year ago)
- Last Synced: 2025-03-06T10:22:36.580Z (about 1 year ago)
- Topics: amd-gpu, cache, cuda, gpu, nvidia-gpu, performance, rocm, triton
- Language: Python
- Homepage:
- Size: 338 KB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Triton Cache Performance Comparison

*CUDA: Triton cache significantly improves startup performance*

*ROCm: Triton cache significantly improves startup performance*
## Proof of Concept
This benchmark compares GPU memory usage and startup performance of Triton kernels in two scenarios:
1. **With Triton cache pre-loaded** - Cache exists from previous run
2. **Without Triton cache** - Clean cache state
Key findings:
- Triton cache significantly reduces startup time
- More consistent memory usage patterns with cached kernels
- Improved resource utilization during initial model loading
## Prerequisites
### Hardware Requirements
- NVIDIA GPU (CUDA) or AMD GPU (ROCm)
## Usage
### Basic Benchmark
```bash
./benchmark.sh --arch [cuda|rocm]
```
### Advanced Options
```bash
# Custom cache location and script
./benchmark.sh \
  --arch cuda \
  --triton-cache-dir ~/alternate_cache \
  --script ./custom_script.py
```
### Expected Output
1. `gpu_usage_log.csv` - Time-series memory data
2. `gpu_memory_usage_comparison.png` - Visualization plot
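
A time-series log like the one listed above could be produced by a small sampler along these lines. This is a minimal sketch, not the repository's implementation: the function names, the CSV column names (`elapsed_s`, `scenario`, `used_mib`), and the pluggable `read_used_mib` callable are all assumptions made for illustration. The `nvidia-smi` reader applies to CUDA hosts only; a ROCm host would need a different reader.

```python
import csv
import subprocess
import time
from pathlib import Path


def read_nvidia_used_mib():
    """Used memory in MiB of the first GPU listed by nvidia-smi (CUDA only)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return int(out.splitlines()[0].strip())


def log_memory(read_used_mib, out_path, samples, interval_s=1.0, label=""):
    """Append `samples` rows to a CSV file, one row per reading.

    Column names are hypothetical; the real gpu_usage_log.csv may differ.
    """
    out_path = Path(out_path)
    # Write the header only when the file is new or empty.
    needs_header = not out_path.exists() or out_path.stat().st_size == 0
    start = time.monotonic()
    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(["elapsed_s", "scenario", "used_mib"])
        for i in range(samples):
            elapsed = round(time.monotonic() - start, 3)
            writer.writerow([elapsed, label, read_used_mib()])
            f.flush()
            # Sleep between samples, but not after the last one.
            if i + 1 < samples:
                time.sleep(interval_s)
```

On a CUDA machine, `log_memory(read_nvidia_used_mib, "gpu_usage_log.csv", samples=60, label="cold")` would take one reading per second for about a minute.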
## Technical Details
### Benchmark Process
1. **Cold Start** (no cache):
   - Purge existing Triton cache
   - Run script
   - Log GPU memory at 1 Hz
2. **Warm Start** (with cache):
   - Reuse generated kernels
   - Run identical script
   - Compare memory/time metrics
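
The cold/warm sequence above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the logic of `benchmark.sh` itself: the helper names `timed_run` and `cold_then_warm` are hypothetical, and the sketch measures only wall-clock time, leaving memory logging out.

```python
import os
import shutil
import subprocess
import sys
import time
from pathlib import Path


def timed_run(script, cache_dir):
    """Run `script` with TRITON_CACHE_DIR set; return wall-clock seconds."""
    env = dict(os.environ, TRITON_CACHE_DIR=str(cache_dir))
    start = time.perf_counter()
    subprocess.run([sys.executable, str(script)], env=env, check=True)
    return time.perf_counter() - start


def cold_then_warm(script, cache_dir):
    """Time one cold run (cache purged) followed by one warm run."""
    cache_dir = Path(cache_dir)
    # Cold start: remove any existing cache so kernels must be compiled.
    shutil.rmtree(cache_dir, ignore_errors=True)
    cold = timed_run(script, cache_dir)
    # Warm start: identical script, reusing whatever the cold run cached.
    warm = timed_run(script, cache_dir)
    return {"cold_s": cold, "warm_s": warm}
```

Pointing the runs at a dedicated directory rather than the default cache keeps the purge step from deleting kernels cached by unrelated workloads.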
### Key Configuration
```bash
# Default cache location. Use $HOME: the shell does not expand ~ inside double quotes.
export TRITON_CACHE_DIR="$HOME/.triton/cache"
```
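
To check which cache directory a run would use and how much it holds, a helper along these lines may be useful. It is a sketch: `resolve_triton_cache_dir` and `cache_size_bytes` are hypothetical names, and the fallback to `~/.triton/cache` simply mirrors the default noted above.

```python
import os
from pathlib import Path


def resolve_triton_cache_dir(env=None):
    """Return the cache directory implied by the environment as a Path.

    Assumption: when TRITON_CACHE_DIR is unset or empty, the default
    ~/.triton/cache applies.
    """
    env = os.environ if env is None else env
    raw = env.get("TRITON_CACHE_DIR", "").strip()
    if not raw:
        raw = "~/.triton/cache"
    # expanduser also resolves a literal "~" that the shell left unexpanded.
    return Path(raw).expanduser()


def cache_size_bytes(cache_dir):
    """Total size of all files under cache_dir (0 if it does not exist)."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())
```

A size of 0 before a run and a non-zero size afterwards is a quick way to confirm that the cold run actually populated the cache.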
## License
Apache 2.0. See [LICENSE](LICENSE).