{"id":15136490,"url":"https://github.com/evilfreelancer/docker-whisper-server","last_synced_at":"2025-10-23T11:31:34.682Z","repository":{"id":249388146,"uuid":"831344266","full_name":"EvilFreelancer/docker-whisper-server","owner":"EvilFreelancer","description":"whisper.cpp HTTP transcription server with OpenAI-like API in Docker","archived":false,"fork":false,"pushed_at":"2025-01-29T13:53:17.000Z","size":869,"stargazers_count":15,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-30T18:06:08.836Z","etag":null,"topics":["api","api-server","asr","cuda","docker","docker-compose","dockerfile","nvidia","openai","openai-api","whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EvilFreelancer.png","metadata":{"files":{"readme":"README.en.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-20T09:26:04.000Z","updated_at":"2025-01-29T13:53:21.000Z","dependencies_parsed_at":"2024-12-29T16:20:47.060Z","dependency_job_id":"1c12282a-9d76-4ca5-9a17-b7ba5707ae81","html_url":"https://github.com/EvilFreelancer/docker-whisper-server","commit_stats":{"total_commits":50,"total_committers":1,"mean_commits":50.0,"dds":0.0,"last_synced_commit":"937d6a65050cf0122bd399c75425beec6f51af29"},"previous_names":["evilfreelancer/docker-whisper-server"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-whisper-server","tags_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-whisper-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-whisper-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-whisper-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EvilFreelancer","download_url":"https://codeload.github.com/EvilFreelancer/docker-whisper-server/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237821569,"owners_count":19371786,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","api-server","asr","cuda","docker","docker-compose","dockerfile","nvidia","openai","openai-api","whisper","whisper-cpp"],"created_at":"2024-09-26T06:22:09.680Z","updated_at":"2025-10-23T11:31:34.675Z","avatar_url":"https://github.com/EvilFreelancer.png","language":"Python","readme":"# Whisper.cpp API Webserver in Docker\n\n[Русский](./README.md) | [中文](./README.zh.md) | **English**\n\nWhisper.cpp HTTP transcription server with OAI-like API in Docker.\n\nThis project provides a Dockerized transcription server based\non [whisper.cpp](https://github.com/ggerganov/whisper.cpp/tree/master/examples/server).\n\n## Features\n\n- Dockerized whisper.cpp HTTP server for audio transcription\n- Configurable via environment variables\n- Automatically converts audio to WAV format\n- Automatically downloads required model on startup\n- Can quantize any Whisper model to the required type on startup\n\n## 
Requirements\n\nBefore you begin, ensure you have a machine with a GPU that supports modern CUDA, due to the computational\ndemands of the Docker image.\n\n* Nvidia GPU / Intel Arc\n* CUDA / oneAPI\n* Docker\n* Docker Compose\n* Nvidia Docker Runtime (Nvidia only)\n\nFor detailed instructions on how to prepare a Linux machine for running neural networks, including the installation of\nCUDA, Docker, and Nvidia Docker Runtime, please refer to the\npublication \"[How to Prepare Linux for Running and Training Neural Networks? (+ Docker)](https://dzen.ru/a/ZVt9kRBCTCGlQqyP)\"\n(in Russian).\n\n## Installation\n\n1. Clone the repo and switch to the source root:\n\n   ```shell\n   git clone https://github.com/EvilFreelancer/docker-whisper-server.git\n   cd docker-whisper-server\n   ```\n\n2. Copy the provided Docker Compose template:\n\n    ```shell\n    cp docker-compose.dist.yml docker-compose.yml\n    ```\n\nExample for Intel Arc cards:\n\n```yaml\nx-shared-logs: \u0026shared-logs\n   logging:\n      driver: \"json-file\"\n      options:\n         max-size: \"10k\"\n\nservices:\n  whisper-intel:\n    restart: \"unless-stopped\"\n    build:\n      context: ./whisper\n      dockerfile: Dockerfile.intel\n      args:\n        - WHISPER_VERSION=v1.7.4\n    devices:\n       - /dev/dri\n    volumes:\n      - ./models:/app/models\n    ports:\n      - \"127.0.0.1:9000:9000\"\n    environment:\n      WHISPER_MODEL: large-v3-turbo\n      WHISPER_MODEL_QUANTIZATION: q4_0\n    \u003c\u003c: *shared-logs\n```\n\n3. Build the Docker image:\n\n    ```shell\n    docker-compose build\n    ```\n\n4. Start the services:\n\n    ```shell\n    docker-compose up -d\n    ```\n\n5. 
Navigate to http://localhost:9000 in your browser:\n\n   ![Swagger UI](./assets/swagger.png)\n\n## Endpoints\n\n### /inference\n\nTranscribe an audio file:\n\n```shell\ncurl 127.0.0.1:9000/inference \\\n  -H \"Content-Type: multipart/form-data\" \\\n  -F file=\"@\u003cfile-path\u003e\" \\\n  -F temperature=\"0.0\" \\\n  -F temperature_inc=\"0.2\" \\\n  -F response_format=\"json\"\n```\n\n### /load\n\nLoad a new Whisper model:\n\n```shell\ncurl 127.0.0.1:9000/load \\\n   -H \"Content-Type: multipart/form-data\" \\\n   -F model=\"\u003cpath-to-model-file-in-docker-container\u003e\"\n```\n\n## Environment variables\n\n**Basic configuration**\n\n| Name                         | Default                               | Description                                                                      |\n|------------------------------|---------------------------------------|----------------------------------------------------------------------------------|\n| `WHISPER_MODEL`              | base.en                               | The default Whisper model to use                                                 |\n| `WHISPER_MODEL_PATH`         | /app/models/ggml-${WHISPER_MODEL}.bin | The default path to the Whisper model file                                       |\n| `WHISPER_MODEL_QUANTIZATION` |                                       | Quantization level (applied only if `WHISPER_MODEL_PATH` is left at its default) |\n\n\u003cdetails\u003e\n\u003csummary\u003e\n\u003ci\u003eAdvanced Configuration\u003c/i\u003e\n\u003c/summary\u003e\n\n| Name                      | Default    | Description                                         |\n|---------------------------|------------|-----------------------------------------------------|\n| `WHISPER_THREADS`         | 4          | Number of threads to use for inference              |\n| `WHISPER_PROCESSORS`      | 1          | Number of processors to use for inference           |\n| `WHISPER_HOST`            | 0.0.0.0    | Host IP or 
hostname to bind the server to           |\n| `WHISPER_PORT`            | 9000       | Port number to listen on                            |\n| `WHISPER_INFERENCE_PATH`  | /inference | Inference path for all requests                     |\n| `WHISPER_PUBLIC_PATH`     |            | Path to the public folder                           |\n| `WHISPER_REQUEST_PATH`    |            | Request path for all requests                       |\n| `WHISPER_OV_E_DEVICE`     | CPU        | OpenVINO device used for encode inference           |\n| `WHISPER_OFFSET_T`        | 0          | Time offset in milliseconds                         |\n| `WHISPER_OFFSET_N`        | 0          | Segment index offset                                |\n| `WHISPER_DURATION`        | 0          | Duration of audio to process in milliseconds        |\n| `WHISPER_MAX_CONTEXT`     | -1         | Maximum context size for inference                  |\n| `WHISPER_MAX_LEN`         | 0          | Maximum segment length in characters                |\n| `WHISPER_BEST_OF`         | 2          | Number of best candidates to keep                   |\n| `WHISPER_BEAM_SIZE`       | -1         | Beam size for search                                |\n| `WHISPER_AUDIO_CTX`       | 0          | Audio context size (0 = full)                       |\n| `WHISPER_WORD_THOLD`      | 0.01       | Word timestamp probability threshold                |\n| `WHISPER_ENTROPY_THOLD`   | 2.40       | Entropy threshold for decoder failure               |\n| `WHISPER_LOGPROB_THOLD`   | -1.00      | Log probability threshold for decoder failure       |\n| `WHISPER_LANGUAGE`        | en         | Spoken language code (`auto` to auto-detect)        |\n| `WHISPER_PROMPT`          |            | Initial prompt                                      |\n| `WHISPER_DTW`             |            | Model preset for DTW token-level timestamps         |\n| `WHISPER_CONVERT`         | true       | Convert audio to WAV, requires ffmpeg on the 
server |\n| `WHISPER_SPLIT_ON_WORD`   | false      | Split on word rather than on token                  |\n| `WHISPER_DEBUG_MODE`      | false      | Enable debug mode                                   |\n| `WHISPER_TRANSLATE`       | false      | Translate from source language to English           |\n| `WHISPER_DIARIZE`         | false      | Stereo audio diarization                            |\n| `WHISPER_TINYDIARIZE`     | false      | Enable tinydiarize (requires a tdrz model)          |\n| `WHISPER_NO_FALLBACK`     | false      | Do not use temperature fallback while decoding      |\n| `WHISPER_PRINT_SPECIAL`   | false      | Print special tokens                                |\n| `WHISPER_PRINT_COLORS`    | false      | Print colors                                        |\n| `WHISPER_PRINT_REALTIME`  | false      | Print output in real time                           |\n| `WHISPER_PRINT_PROGRESS`  | false      | Print progress                                      |\n| `WHISPER_NO_TIMESTAMPS`   | false      | Do not print timestamps                             |\n| `WHISPER_DETECT_LANGUAGE` | false      | Exit after automatically detecting language         |\n\n\u003c/details\u003e\n\n## Links\n\n- [whisper.cpp](https://github.com/ggerganov/whisper.cpp)\n- [server example](https://github.com/ggerganov/whisper.cpp/tree/master/examples/server) from whisper.cpp\n\n## Citing\n\n```text\n[Pavel Rykov]. (2024). Whisper.cpp API Webserver in Docker. GitHub. 
https://github.com/EvilFreelancer/docker-whisper-server\n```\n\n```text\n@misc{pavelrykov2024whisperapi,\n  author = {Pavel Rykov},\n  title  = {Whisper.cpp API Webserver in Docker},\n  year   = {2024},\n  url    = {https://github.com/EvilFreelancer/docker-whisper-server}\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fdocker-whisper-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevilfreelancer%2Fdocker-whisper-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fdocker-whisper-server/lists"}