https://github.com/aleph-alpha/locust-sse
Locust plugin for SSE (useful for loadtesting LLMs)
- Host: GitHub
- URL: https://github.com/aleph-alpha/locust-sse
- Owner: Aleph-Alpha
- License: MIT
- Created: 2025-12-09T11:33:23.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-12-12T09:34:02.000Z (4 months ago)
- Last Synced: 2025-12-22T09:38:45.821Z (3 months ago)
- Topics: llm, llm-loadtesting, loadtesting, locust, locust-plugin, plugin, sse, sse-loadtesting
- Language: Python
- Homepage:
- Size: 154 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Locust SSE User
A Locust plugin for testing Server-Sent Events (SSE) endpoints, specifically designed for LLM streaming response benchmarking.
## Installation
You can install this package using `uv` (recommended) or `pip`.
### Using uv
```bash
uv add locust-sse
```
### Using pip
```bash
pip install locust-sse
```
## Usage
Inherit from `SSEUser` in your `locustfile.py` and use the `handle_sse_request` method to make SSE requests.
```python
from locust import task

from locust_sse import SSEUser


class MyLLMUser(SSEUser):
    # Set the host for the user
    host = "http://localhost:8080"

    @task
    def chat(self):
        # Example payload for a chat completion endpoint
        payload = {
            "model": "gpt-4",
            "messages": [
                {"role": "user", "content": "Tell me a joke."}
            ],
            "stream": True,
        }

        # Make the SSE request
        self.handle_sse_request(
            url="/chat/completions",
            params={"json": payload},
            prompt="Tell me a joke.",
            request_name="chat_completion",
        )
```
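For context, an OpenAI-style streaming endpoint delivers its response as `data:`-prefixed SSE lines, terminated by a `[DONE]` sentinel. The following is a minimal, illustrative parser for that wire format (the payloads shown are examples, not the plugin's internals):

```python
import json


def parse_sse_lines(lines):
    """Extract JSON payloads from the `data:` lines of an SSE stream."""
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # OpenAI-style end-of-stream sentinel
            break
        events.append(json.loads(data))
    return events


# Example stream as it would arrive over the wire
stream = [
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
tokens = [e["choices"][0]["delta"].get("content", "") for e in parse_sse_lines(stream)]
print("".join(tokens))  # -> Hello world
```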
## Metrics
This plugin automatically tracks specific metrics relevant to LLM streaming performance and reports them to Locust.
| Metric | Description |
| :--- | :--- |
| **TTFT** | **Time To First Token**. Measures the latency from the start of the request until the first "append" event is received. |
| **Prompt Tokens** | Number of tokens in the input prompt (estimated). |
| **Completion Tokens** | Number of tokens in the generated response (estimated). |
| **Processing Time** | Total time taken for the entire generation process. |
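TTFT is straightforward to compute from timestamps around the event stream. A hedged sketch of the measurement idea (not the plugin's actual implementation):

```python
import time


def measure_ttft(event_iter):
    """Consume a stream of events, returning (ttft_ms, events).

    ttft_ms is the time from calling this function until the first
    event arrives, in milliseconds.
    """
    start = time.perf_counter()
    ttft_ms = None
    events = []
    for event in event_iter:
        if ttft_ms is None:
            # First token observed: record time-to-first-token
            ttft_ms = (time.perf_counter() - start) * 1000.0
        events.append(event)
    return ttft_ms, events


def slow_stream():
    """Stand-in for an SSE response: each token takes ~10 ms to generate."""
    for token in ["Hello", " world"]:
        time.sleep(0.01)
        yield token


ttft, events = measure_ttft(slow_stream())
print(f"TTFT: {ttft:.1f} ms for {len(events)} tokens")
```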
### How Metrics Appear in Locust
These metrics are reported as separate entries in the Locust statistics table:
- `{request_name}_ttft`: Latency statistics for the first token.
- `{request_name}_prompt_tokens`: "Response Length" column shows token count.
- `{request_name}_completion_tokens`: "Response Length" column shows token count.
- `{request_name}`: The main request entry showing total duration.
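Concretely, the usage example above (`request_name="chat_completion"`) would yield the entries `chat_completion_ttft`, `chat_completion_prompt_tokens`, `chat_completion_completion_tokens`, and `chat_completion`. A tiny helper illustrating the naming convention (for dashboards or post-processing; not part of the plugin's API):

```python
def metric_entries(request_name: str) -> list[str]:
    """Stats-table entry names produced for one SSE request."""
    return [
        f"{request_name}_ttft",
        f"{request_name}_prompt_tokens",
        f"{request_name}_completion_tokens",
        request_name,
    ]


print(metric_entries("chat_completion"))
```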
## Development
This project uses `uv` for dependency management.
```bash
# Install dependencies
uv sync
# Run tests
uv run pytest
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.