https://github.com/tom-doerr/llm_api_testing

Host: GitHub
URL: https://github.com/tom-doerr/llm_api_testing
Owner: tom-doerr
License: mit
Created: 2025-01-11T02:57:37.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-06T19:21:06.000Z (over 1 year ago)
Last Synced: 2025-06-19T10:15:39.045Z (12 months ago)
Language: Python
Size: 742 KB
Stars: 12
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Deepseek Performance Monitoring

A performance monitoring tool for the Deepseek API, built on LiteLLM.

## Results
![Performance Plot](performance_results/performance_plot.png)

## 🚀 Features
- Measures response latency in milliseconds
- Calculates tokens processed per second
- Configurable monitoring intervals
- Comprehensive CSV logging
- Adjustable test duration
- Support for multiple Deepseek models
- Random prompt generation with configurable distribution
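
The two core measurements, latency and tokens per second, can be illustrated with a small timing helper. This is a minimal sketch, not the repository's implementation: `measure_stream` and its `clock` parameter are hypothetical names, and it assumes each non-empty streamed chunk counts as roughly one token.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a stream of text chunks and return timing metrics."""
    start = clock()
    first_token_at: Optional[float] = None
    n_chunks = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty chunks
        if first_token_at is None:
            first_token_at = clock()  # time the first non-empty chunk
        n_chunks += 1
    total_s = clock() - start
    first_ms = None if first_token_at is None else (first_token_at - start) * 1000
    return {
        "first_token_latency_ms": first_ms,
        "total_latency_ms": total_s * 1000,
        "completion_tokens": n_chunks,  # assumption: one chunk is about one token
        "tokens_per_second": n_chunks / total_s if total_s > 0 else 0.0,
    }
```

With LiteLLM, the chunk iterable would presumably come from a streaming call such as `litellm.completion(..., stream=True)`, reading each chunk's delta content. Injecting the clock keeps the helper testable without network access.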

## 🛠️ Usage

1. Ensure LiteLLM is installed and configured
2. Set your Deepseek API key as an environment variable:
```bash
export DEEPSEEK_API_KEY='your_api_key_here'
```

3. Run the script with desired options:
```bash
# Basic usage with default settings
python3 deepseek_performance_monitor.py

# Run for 1 week with 30-second intervals
python3 deepseek_performance_monitor.py --duration 168 --interval 30

# Custom output file and increased reasoner model usage
python3 deepseek_performance_monitor.py --output custom_results.csv --reasoner-ratio 0.2
```

### Command-line Options
- `--duration`: Test duration in hours (default: 24)
- `--interval`: Time between requests in seconds (default: 60)
- `--output`: Output CSV file path (default: deepseek_performance.csv)
- `--reasoner-ratio`: Probability of using deepseek-reasoner model (default: 0.1)
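
For reference, the options above map naturally onto `argparse`. The following is a sketch of how such a parser could look, using the documented flag names and defaults. `build_parser` and `planned_requests` are hypothetical helpers, and the argument types (`float` for numeric options) are an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Deepseek performance monitor (sketch)")
    parser.add_argument("--duration", type=float, default=24,
                        help="Test duration in hours")
    parser.add_argument("--interval", type=float, default=60,
                        help="Time between requests in seconds")
    parser.add_argument("--output", default="deepseek_performance.csv",
                        help="Output CSV file path")
    parser.add_argument("--reasoner-ratio", type=float, default=0.1,
                        help="Probability of using deepseek-reasoner")
    return parser


def planned_requests(args: argparse.Namespace) -> int:
    """Upper bound on request count; ignores the time each request takes."""
    return int(args.duration * 3600 // args.interval)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"At most {planned_requests(args)} requests, writing to {args.output}")
```

Under these assumptions, the one-week example (`--duration 168 --interval 30`) allows at most 168 × 3600 / 30 = 20160 requests.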

## 📈 Output

By default, the script writes its results to `deepseek_performance.csv` (change the path with `--output`). The file has the following columns:
- timestamp: Measurement time
- first_token_latency_ms: Time to first token in milliseconds
- total_latency_ms: Total response time in milliseconds
- tokens_per_second: Tokens processed per second
- completion_tokens: Number of tokens in the model's response
- prompt_tokens: Number of tokens in the prompt
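
A row with these columns could be written using the standard `csv` module. This is an illustrative sketch only: `append_row` is a hypothetical helper, the values are made up for the example, and the real script's column order or extra fields (for example, for logged errors) may differ.

```python
import csv
import io

# Column names as listed above; the order is an assumption.
COLUMNS = [
    "timestamp",
    "first_token_latency_ms",
    "total_latency_ms",
    "tokens_per_second",
    "completion_tokens",
    "prompt_tokens",
]


def append_row(fh, row: dict, write_header: bool = False) -> None:
    """Write one measurement to an open text file handle."""
    writer = csv.DictWriter(fh, fieldnames=COLUMNS)
    if write_header:
        writer.writeheader()
    writer.writerow(row)


# Demonstration with an in-memory buffer and made-up values.
buf = io.StringIO()
append_row(buf, {
    "timestamp": "2025-01-11 04:02:02",
    "first_token_latency_ms": 800.0,
    "total_latency_ms": 1500.0,
    "tokens_per_second": 10.0,
    "completion_tokens": 15,
    "prompt_tokens": 20,
}, write_header=True)
print(buf.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` documentation recommends, and emit the header only when the file is first created.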

## 📋 Example Output

### Performance Statistics
```plaintext
Average TPS: 45.96
Max TPS: 47.43
Min TPS: 44.67

Average First Token Latency: 1030.93 ms
Max First Token Latency: 1184.10 ms
Min First Token Latency: 794.79 ms

Average Total Latency: 17826.14 ms
Max Total Latency: 20675.07 ms
Min Total Latency: 14632.69 ms

Total Completion Tokens Processed: 10657
Total Requests: 13
```

Sample per-request log lines:
```plaintext
2025-01-11 04:02:02 - Latency: 1550.53ms, TPS: 9.67, Tokens: 15
2025-01-11 04:03:04 - Latency: 1317.85ms, TPS: 11.38, Tokens: 15
2025-01-11 04:04:05 - Latency: 1375.23ms, TPS: 10.91, Tokens: 15
```
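
The sample log lines are consistent with tokens per second being the token count divided by the latency in seconds. The sketch below recomputes the TPS values from those three lines and summarizes them. `tokens_per_second` and `summarize` are hypothetical helpers, and the formula is inferred from the sample numbers rather than taken from the script.

```python
from statistics import mean


def tokens_per_second(completion_tokens: int, latency_ms: float) -> float:
    """TPS as tokens divided by latency in seconds (inferred formula)."""
    return completion_tokens / (latency_ms / 1000.0)


def summarize(values: list) -> dict:
    """Average, maximum, and minimum of a list of measurements."""
    return {"avg": mean(values), "max": max(values), "min": min(values)}


# (tokens, latency in ms) taken from the three sample log lines
samples = [(15, 1550.53), (15, 1317.85), (15, 1375.23)]
tps = [round(tokens_per_second(tokens, ms), 2) for tokens, ms in samples]
print(tps)  # [9.67, 11.38, 10.91]
print(summarize(tps))
```

The same `summarize` helper could be applied to the latency columns to produce a statistics report like the one shown above.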

## 📦 Requirements
- Python 3
- LiteLLM
- Deepseek API key

## 📝 Notes
- Errors are logged to console and CSV but don't stop execution
- Results are saved to CSV for later analysis
- Supports both deepseek-chat and deepseek-reasoner models
- Random prompt generation ensures diverse testing scenarios
- Error rate tracking excludes "context size exceeded" errors
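
The split between the two models can be sketched as a weighted coin flip driven by `--reasoner-ratio`. `pick_model` is a hypothetical helper, not the script's actual function, and the bare model names are used as written in this README (LiteLLM typically expects a provider prefix such as `deepseek/`, which is an assumption here).

```python
import random


def pick_model(reasoner_ratio: float, rng: random.Random) -> str:
    """Return deepseek-reasoner with probability reasoner_ratio, else deepseek-chat."""
    if rng.random() < reasoner_ratio:
        return "deepseek-reasoner"
    return "deepseek-chat"


# Demonstration: with the default ratio of 0.1, about 10% of picks are the reasoner.
rng = random.Random(0)  # seeded so the demo is reproducible
picks = [pick_model(0.1, rng) for _ in range(10_000)]
share = picks.count("deepseek-reasoner") / len(picks)
print(f"reasoner share: {share:.3f}")
```

Passing the random generator in explicitly makes the choice reproducible in tests while still allowing an unseeded generator in a long-running monitor.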
