{"id":43116958,"url":"https://github.com/modal-labs/stopwatch","last_synced_at":"2026-01-31T19:05:30.660Z","repository":{"id":286687511,"uuid":"962222412","full_name":"modal-labs/stopwatch","owner":"modal-labs","description":"A tool for benchmarking LLMs on Modal","archived":false,"fork":false,"pushed_at":"2025-08-25T18:44:13.000Z","size":2192,"stargazers_count":42,"open_issues_count":0,"forks_count":4,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-25T20:30:47.937Z","etag":null,"topics":["llms","machine-learning","sglang","tensorrt-llm","vllm"],"latest_commit_sha":null,"homepage":"https://modal.com/llm-almanac","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/modal-labs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-07T20:45:02.000Z","updated_at":"2025-08-25T18:44:16.000Z","dependencies_parsed_at":"2025-05-03T10:20:21.528Z","dependency_job_id":"dc34e8c9-a22c-4c75-989e-8fd54508338a","html_url":"https://github.com/modal-labs/stopwatch","commit_stats":null,"previous_names":["modal-labs/stopwatch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/modal-labs/stopwatch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modal-labs%2Fstopwatch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modal-labs%2Fstopwatch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modal-labs%2Fstopwatch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modal-labs%2Fstopwatch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/modal-labs","download_url":"https://codeload.github.com/modal-labs/stopwatch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modal-labs%2Fstopwatch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28950361,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-31T18:30:42.805Z","status":"ssl_error","status_checked_at":"2026-01-31T18:30:19.593Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llms","machine-learning","sglang","tensorrt-llm","vllm"],"created_at":"2026-01-31T19:05:29.991Z","updated_at":"2026-01-31T19:05:30.654Z","avatar_url":"https://github.com/modal-labs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# stopwatch\n\n_A simple solution for benchmarking [vLLM](https://docs.vllm.ai/en/latest/), [SGLang](https://docs.sglang.ai/), and [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) on [Modal](https://modal.com/)._ ⏱️\n\n## Setup\n\n### Install dependencies\n\n```bash\npip install -e .\n```\n\n## Run a benchmark\n\nTo run a single benchmark, you can use the `provision-and-benchmark` command, which will provision an LLM server, benchmark it, and save the results to a local file.\nFor example, to run a synchronous (one request after another) benchmark with vLLM and save the results to `results.json`:\n\n```bash\nLLM_SERVER_TYPE=vllm\nMODEL=meta-llama/Llama-3.1-8B-Instruct\nOUTPUT_PATH=results.json\n\nstopwatch provision-and-benchmark $MODEL $LLM_SERVER_TYPE --output-path $OUTPUT_PATH\n```\n\nOr, to run a fixed-rate (e.g. 5 requests per second) multi-GPU benchmark with SGLang:\n\n```bash\nGPU_COUNT=4\nGPU_TYPE=H100\nLLM_SERVER_TYPE=sglang\nRATE_TYPE=constant\nREQUESTS_PER_SECOND=5\n\nstopwatch provision-and-benchmark $MODEL $LLM_SERVER_TYPE --output-path $OUTPUT_PATH --gpu \"$GPU_TYPE:$GPU_COUNT\" --rate-type $RATE_TYPE --rate $REQUESTS_PER_SECOND --llm-server-config \"{\\\"extra_args\\\": [\\\"--tp-size\\\", \\\"$GPU_COUNT\\\"]}\"\n```\n\nOr, to run a throughput (as many requests as the server can handle) test with TensorRT-LLM:\n\n```bash\nLLM_SERVER_TYPE=tensorrt-llm\nRATE_TYPE=throughput\n\nstopwatch provision-and-benchmark $MODEL $LLM_SERVER_TYPE --output-path $OUTPUT_PATH --rate-type $RATE_TYPE\n```\n\n## Run the profiler\n\nTo profile a server with the PyTorch profiler, use the following command (only vLLM and SGLang are currently supported):\n\n```bash\nLLM_SERVER_TYPE=vllm\nMODEL=meta-llama/Llama-3.1-8B-Instruct\nNUM_REQUESTS=10\nOUTPUT_PATH=trace.json.gz\n\nstopwatch profile $MODEL $LLM_SERVER_TYPE --output-path $OUTPUT_PATH --num-requests $NUM_REQUESTS\n```\n\nOnce the profiling is done, the trace will be saved to `trace.json.gz`, which you can open and visualize at [https://ui.perfetto.dev](https://ui.perfetto.dev).\nKeep in mind that generated traces can get very large, so it is recommended to only send a few requests while profiling.\n\n## Run tests\n\nBefore committing any changes, you should make sure that your changes don't break any core functionality in Stopwatch.\nYou may verify this with:\n\n```bash\npytest\n```\n\n### Lint\n\nTo make sure that any code changes are compliant with our linting rules, you can run `ruff` with:\n\n```bash\nruff check\n```\n\n## Contributing\n\nWe welcome contributions, including those that add tuned benchmarks to our collection.\nSee the [CONTRIBUTING](/CONTRIBUTING.md) file and the [Getting Started](https://github.com/modal-labs/big-benchmark/wiki/Getting-Started) document for more details on contributing to Stopwatch.\n\n## License\n\nStopwatch is available under the MIT license. See the [LICENSE](/LICENSE.md) file for more details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodal-labs%2Fstopwatch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmodal-labs%2Fstopwatch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodal-labs%2Fstopwatch/lists"}