{"id":48986319,"url":"https://github.com/GeiserX/whisper-subs","last_synced_at":"2026-05-04T16:01:04.732Z","repository":{"id":327805236,"uuid":"1110874237","full_name":"GeiserX/whisper-subs","owner":"GeiserX","description":"Jellyfin plugin for local AI-powered subtitle generation using Whisper - all processing stays on your server","archived":false,"fork":false,"pushed_at":"2026-04-25T12:32:46.000Z","size":454,"stargazers_count":33,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-04-25T14:23:09.116Z","etag":null,"topics":["accessibility","ai","automation","csharp","dotnet","hacktoberfest","homelab","jellyfin","jellyfin-plugin","media-management","media-server","open-source","parakeet","self-hosted","speech-to-text","stt","subtitles","transcription","whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GeiserX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":"geiserx","patreon":"geiser","buy_me_a_coffee":"geiser","thanks_dev":"u/gh/geiserx"}},"created_at":"2025-12-05T21:24:26.000Z","updated_at":"2026-04-25T12:32:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"46b6a922-fd1e-4f62-bb7b-7842a9b2bc71","html_url":"https://github.com/GeiserX/whisper-subs","commit_stats":null,"previous_names":["geiserx/jelly-subtitles","geiserx/whisper-subs"],"tags_count":46,"template":false,"template_full_name":null,"purl":"pkg:github/GeiserX/whisper-subs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeiserX%2Fwhisper-subs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeiserX%2Fwhisper-subs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeiserX%2Fwhisper-subs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeiserX%2Fwhisper-subs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GeiserX","download_url":"https://codeload.github.com/GeiserX/whisper-subs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GeiserX%2Fwhisper-subs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32614385,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","ai","automation","csharp","dotnet","hacktoberfest","homelab","jellyfin","jellyfin-plugin","media-management","media-server","open-source","parakeet","self-hosted","speech-to-text","stt","subtitles","transcription","whisper","whisper-cpp"],"created_at":"2026-04-18T13:00:27.678Z","updated_at":"2026-05-04T16:01:04.725Z","avatar_url":"https://github.com/GeiserX.png","language":"C#","funding_links":["https://github.com/sponsors/geiserx","https://patreon.com/geiser","https://buymeacoffee.com/geiser","https://thanks.dev/u/gh/geiserx"],"categories":["🧩 Plugins","Applications","Subtitle and Localization"],"sub_categories":["📚 Library Management"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/images/banner.svg\" alt=\"WhisperSubs Banner\" width=\"900\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/GeiserX/whisper-subs/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/GeiserX/whisper-subs?style=flat-square\u0026logo=github\u0026color=6B4C9A\" alt=\"Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/GeiserX/whisper-subs/actions/workflows/build-release.yml\"\u003e\u003cimg src=\"https://img.shields.io/github/actions/workflow/status/GeiserX/whisper-subs/build-release.yml?branch=main\u0026style=flat-square\u0026label=tests\" alt=\"Tests\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/GeiserX/whisper-subs/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-GPL--3.0-blue?style=flat-square\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/.NET-9.0-512BD4?style=flat-square\u0026logo=dotnet\u0026logoColor=white\" alt=\".NET 9.0\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Jellyfin-10.11%2B-6B4C9A?style=flat-square\" alt=\"Jellyfin 10.11+\"\u003e\n  \u003ca href=\"https://github.com/awesome-jellyfin/awesome-jellyfin#readme\"\u003e\u003cimg src=\"https://img.shields.io/badge/listed%20on-awesome--jellyfin-00a4dc?style=flat-square\u0026logo=jellyfin\u0026logoColor=white\" alt=\"listed on awesome-jellyfin\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://codecov.io/gh/GeiserX/whisper-subs\"\u003e\u003cimg src=\"https://codecov.io/gh/GeiserX/whisper-subs/graph/badge.svg\" alt=\"codecov\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n**WhisperSubs** is a Jellyfin plugin that automatically generates subtitles for your media library using local AI models. All transcription runs entirely on your server -- no audio data ever leaves your network. Your media stays private.\n\n## Features\n\n- **Fully Local Processing** -- Audio is transcribed on your hardware using [whisper.cpp](https://github.com/ggerganov/whisper.cpp). No cloud APIs, no external services, no data exfiltration.\n- **Built-in Engine Setup** -- Download whisper-cli binaries and models directly from the plugin settings page on Linux. No manual installation needed for most users.\n- **Automatic Language Detection** -- Reads audio stream metadata to detect the spoken language and generate matching subtitles. Falls back to whisper's built-in language detection when tags are absent.\n- **Forced Subtitles** -- Detect and transcribe only foreign-language dialogue (e.g., French lines in an English movie) via VAD-based speech segmentation and per-chunk language detection.\n- **Lyrics Generation (Experimental)** -- Generate `.lrc` lyrics files for music libraries via whisper transcription. Jellyfin picks up `.lrc` files automatically.\n- **GPU Acceleration** -- Supports CUDA (NVIDIA), Vulkan (Intel / AMD / NVIDIA), and ROCm (AMD) for significantly faster transcription.\n- **Priority Queue** -- Manual requests are queued with priority and processed before scheduled items. Queue persists across restarts.\n- **Real-time Progress** -- Live progress banner in the admin UI showing current item, phase (extracting audio, transcribing), per-file progress, and overall stats.\n- **Subtitle Resume** -- If transcription is interrupted, it resumes from the last timestamp rather than starting over.\n- **Admin Dashboard UI** -- Browse libraries, view items, manage the whisper engine, and trigger subtitle generation directly from the Jellyfin admin panel.\n- **Scheduled Tasks** -- Enable automatic scanning so new media gets subtitles without manual intervention. Runs daily at 2:00 AM and on startup by default.\n- **Per-Library Control** -- Choose which libraries are monitored for automatic subtitle generation.\n- **Multiple Output Formats** -- Generates `.srt` subtitles, `.forced.generated.srt` forced subtitles, and `.lrc` lyrics, all placed alongside your media and auto-detected by Jellyfin.\n\n## Prerequisites\n\n| Dependency | Details |\n|---|---|\n| **Jellyfin** | 10.11.0 or later |\n| **FFmpeg** | Bundled with Jellyfin (`/usr/lib/jellyfin-ffmpeg/ffmpeg`) or available in `PATH`. Used to extract audio from media files. |\n| **whisper.cpp** | The `whisper-cli` binary. **On Linux, the plugin can download this automatically** from the settings page. Otherwise, install manually -- see [Installing whisper.cpp](#installing-whispercpp). |\n| **Whisper Model** | A GGML model file. **The plugin can download models automatically** from the settings page, or download manually from [Hugging Face](https://huggingface.co/ggerganov/whisper.cpp). |\n\n\u003e **Quick start (Linux):** After installing the plugin, go to **Dashboard** \u003e **Plugins** \u003e **WhisperSubs**. The **Whisper Engine** section lets you download both the binary and a model with one click each. The manual steps below are only needed for non-Linux platforms or custom setups.\n\n## Installation\n\n### From the Jellyfin Plugin Repository (Recommended)\n\n1. In Jellyfin, go to **Dashboard** \u003e **Plugins** \u003e **Repositories**.\n2. Add a new repository with this URL:\n   ```\n   https://geiserx.github.io/whisper-subs/manifest.json\n   ```\n3. Go to **Catalog**, find **WhisperSubs**, and click **Install**.\n4. Restart Jellyfin.\n\n### Manual Installation\n\n1. Build from source:\n   ```bash\n   dotnet build --configuration Release\n   ```\n2. Copy `WhisperSubs.dll` to your Jellyfin plugins directory:\n   ```\n   /var/lib/jellyfin/plugins/WhisperSubs/\n   ```\n3. Restart Jellyfin.\n\n## Installing whisper.cpp\n\nThe plugin requires whisper.cpp for transcription. Choose the method that matches your setup.\n\n### Option A: Pre-built Binary (Recommended for most users)\n\n1. Download the latest release for your platform from [whisper.cpp releases](https://github.com/ggerganov/whisper.cpp/releases).\n2. Extract and place the `whisper-cli` binary somewhere persistent (e.g., `/opt/whisper/`).\n3. Download a model:\n   ```bash\n   mkdir -p /opt/whisper/models\n\n   # Base model (~148 MB) -- fast, good for quick transcription\n   wget -O /opt/whisper/models/ggml-base.bin \\\n     https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin\n\n   # Large V3 Turbo (~1.6 GB) -- best accuracy with reasonable speed (recommended)\n   wget -O /opt/whisper/models/ggml-large-v3-turbo.bin \\\n     https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin\n   ```\n4. In the plugin settings, set **Whisper Binary Path** to `/opt/whisper/whisper-cli` and **Whisper Model Path** to the model file.\n\n### Option B: Build from Source (CPU only)\n\n```bash\ngit clone https://github.com/ggerganov/whisper.cpp.git\ncd whisper.cpp\ncmake -B build -DBUILD_SHARED_LIBS=OFF\ncmake --build build --config Release -j$(nproc)\n# Binary will be at build/bin/whisper-cli\n```\n\n### Option C: Build from Source with GPU Acceleration\n\nSee [GPU Acceleration](#gpu-acceleration) below for detailed instructions.\n\n### Docker / Container Setups\n\nIf Jellyfin runs in a Docker container, whisper.cpp must be accessible **inside** the container. The recommended approach is to bind-mount a host directory containing the binary and model:\n\n```yaml\n# docker-compose.yml\nservices:\n  jellyfin:\n    image: jellyfin/jellyfin\n    volumes:\n      - /opt/whisper:/opt/whisper:ro   # whisper-cli binary + models\n      # ... your other volumes\n```\n\nThen configure the plugin with:\n- **Whisper Binary Path**: `/opt/whisper/whisper-cli`\n- **Whisper Model Path**: `/opt/whisper/models/ggml-large-v3-turbo.bin`\n\n\u003e **Note:** The binary must be compiled for the same architecture as the container (typically x86_64 Linux). Download the `linux-x64` release asset or build inside a matching environment.\n\n#### \u003ca id=\"container-setup\"\u003e\u003c/a\u003eContainer Library Requirements\n\nThe plugin's built-in binary downloader fetches pre-built whisper-cli binaries. These require runtime libraries that are **not included** in the default Jellyfin Docker image:\n\n| Variant | Required packages | Install command |\n|---------|-------------------|-----------------|\n| **CPU** | `libgomp1` | `apt install libgomp1` |\n| **Vulkan** | `libgomp1`, `libvulkan1`, `mesa-vulkan-drivers` | `apt install libgomp1 libvulkan1 mesa-vulkan-drivers` |\n| **CUDA 12** | `libgomp1` + NVIDIA Container Toolkit on host | See [CUDA section](#cuda-nvidia) |\n| **ROCm** | `libgomp1` + ROCm runtime | See [ROCm docs](https://rocm.docs.amd.com/) |\n\n\u003e `libgomp1` (OpenMP threading) is required by **all** variants, including CPU.\n\nTo install persistently, add to your container's entrypoint or Dockerfile:\n\n```bash\napt-get update -qq \u0026\u0026 apt-get install -y -qq --no-install-recommends libgomp1 \u0026\u0026 rm -rf /var/lib/apt/lists/*\n```\n\nThe plugin's setup page will detect missing libraries and warn you before downloading.\n\n### Verifying the Installation\n\n```bash\n# If in PATH:\nwhisper-cli --help\n\n# If using an absolute path:\n/opt/whisper/whisper-cli --help\n\n# Inside a Docker container:\ndocker exec jellyfin /opt/whisper/whisper-cli --help\n```\n\n## GPU Acceleration\n\nwhisper.cpp supports GPU offloading via **Vulkan** (Intel, AMD, and some NVIDIA GPUs), **CUDA** (NVIDIA), and **ROCm** (AMD). GPU acceleration dramatically reduces transcription time, especially with larger models.\n\n\u003e **Docker users:** Passing the GPU device (e.g., `/dev/dri`) to a container is **not enough** -- the container also needs the matching userspace libraries installed. The auto-setup wizard detects both the device and the library and will fall back to CPU if the library is missing.\n\u003e\n\u003e | Backend | Device | Required library | Install command (Debian/Ubuntu) |\n\u003e |---------|--------|------------------|---------------------------------|\n\u003e | **CUDA** | `/dev/nvidia0` | `libcuda.so.1` | `nvidia-container-toolkit` (host) |\n\u003e | **Vulkan** | `/dev/dri` | `libvulkan.so.1` + ICD JSON | `apt install libvulkan1 mesa-vulkan-drivers` (also needs `/usr/share/vulkan/icd.d/*.json`) |\n\u003e | **ROCm** | `/dev/kfd` | `libamdhip64.so` | `apt install rocm-hip-runtime` |\n\n### Vulkan (Intel / AMD)\n\nVulkan is the best option for Intel iGPUs (e.g., UHD 770) and AMD GPUs. It works through the Mesa Vulkan drivers.\n\n#### Building whisper.cpp with Vulkan\n\n```bash\ngit clone https://github.com/ggerganov/whisper.cpp.git\ncd whisper.cpp\ncmake -B build \\\n  -DGGML_VULKAN=ON \\\n  -DBUILD_SHARED_LIBS=OFF\ncmake --build build --config Release -j$(nproc)\n# Binary: build/bin/whisper-cli\n```\n\n\u003e **Important:** The CMake flag is `-DGGML_VULKAN=ON` (not `-DWHISPER_VULKAN`). This is a common source of confusion.\n\n#### Runtime Dependencies\n\nThe Vulkan binary requires these libraries at runtime:\n\n| Package (Debian/Ubuntu) | Purpose |\n|---|---|\n| `libvulkan1` | Vulkan loader |\n| `mesa-vulkan-drivers` | Intel (ANV) and AMD (RADV) Vulkan ICDs |\n| `libgomp1` | OpenMP threading |\n\n```bash\napt-get install -y libvulkan1 mesa-vulkan-drivers libgomp1\n```\n\n#### Docker: GPU Passthrough for Vulkan\n\nTo use an Intel or AMD GPU inside a Docker container:\n\n```yaml\nservices:\n  jellyfin:\n    image: jellyfin/jellyfin\n    devices:\n      - /dev/dri:/dev/dri    # GPU render nodes\n    volumes:\n      - /opt/whisper:/opt/whisper:ro\n```\n\nThe container also needs the Vulkan runtime libraries. If using the official Jellyfin image (Debian-based), install them on startup:\n\n```yaml\n    entrypoint:\n      - /bin/bash\n      - -c\n      - |\n        dpkg -s libvulkan1 \u003e /dev/null 2\u003e\u00261 || \\\n          (apt-get update -qq \u0026\u0026 \\\n           apt-get install -y -qq --no-install-recommends \\\n             libvulkan1 mesa-vulkan-drivers libgomp1 \u003e /dev/null 2\u003e\u00261 \u0026\u0026 \\\n           rm -rf /var/lib/apt/lists/*)\n        exec /jellyfin/jellyfin\n```\n\nVerify GPU detection inside the container:\n\n```bash\n# Should show your GPU (e.g., \"Intel(R) UHD Graphics 770\")\ndocker exec jellyfin apt-get update -qq \u0026\u0026 \\\n  docker exec jellyfin apt-get install -y -qq vulkan-tools \u0026\u0026 \\\n  docker exec jellyfin vulkaninfo --summary\n```\n\n#### Building Inside Docker (ABI Compatibility)\n\nWhen Jellyfin runs in a container, the whisper binary must be compiled against matching system libraries. Build inside a container with the same base image:\n\n```bash\n# On the Docker host:\ndocker run --rm -v /opt/whisper:/output debian:trixie bash -c '\n  apt-get update \u0026\u0026 apt-get install -y git cmake g++ libvulkan-dev \u0026\u0026\n  git clone https://github.com/ggerganov/whisper.cpp.git /tmp/whisper \u0026\u0026\n  cd /tmp/whisper \u0026\u0026\n  cmake -B build -DGGML_VULKAN=ON -DBUILD_SHARED_LIBS=OFF \u0026\u0026\n  cmake --build build --config Release -j$(nproc) \u0026\u0026\n  cp build/bin/whisper-cli /output/whisper-cli\n'\n```\n\n### CUDA (NVIDIA)\n\nFor NVIDIA GPUs with CUDA support:\n\n#### Building whisper.cpp with CUDA\n\n```bash\ngit clone https://github.com/ggerganov/whisper.cpp.git\ncd whisper.cpp\ncmake -B build \\\n  -DGGML_CUDA=ON \\\n  -DBUILD_SHARED_LIBS=OFF\ncmake --build build --config Release -j$(nproc)\n```\n\n#### Docker: NVIDIA GPU Passthrough\n\n```yaml\nservices:\n  jellyfin:\n    image: jellyfin/jellyfin\n    runtime: nvidia\n    environment:\n      - NVIDIA_VISIBLE_DEVICES=all\n    deploy:\n      resources:\n        reservations:\n          devices:\n            - driver: nvidia\n              count: all\n              capabilities: [gpu]\n    volumes:\n      - /opt/whisper:/opt/whisper:ro\n```\n\n\u003e Requires the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).\n\n### Verifying GPU Acceleration\n\nAfter configuring GPU support, trigger a transcription and check the Jellyfin logs. You should see:\n\n```\n# Vulkan\nwhisper_backend_init_gpu: using Vulkan0 backend\n\n# CUDA\nwhisper_backend_init_gpu: using CUDA0 backend\n```\n\nIf you see `no GPU found` or `using CPU backend`, the binary was not built with GPU support or the runtime drivers are missing.\n\n### Model Recommendations\n\n| Model | Size | Speed (CPU) | Speed (GPU) | Quality | Use Case |\n|---|---|---|---|---|---|\n| `ggml-large-v3-turbo-q5_0.bin` | 574 MB | Moderate | Fast | Excellent | **Recommended.** Best quality/size ratio. |\n| `ggml-large-v3-turbo.bin` | 1.6 GB | Slow | Fast | Excellent | Full-precision turbo. Slightly better quality, 3x larger. |\n| `ggml-medium-q5_0.bin` | 539 MB | Moderate | Fast | Very good | Similar size to turbo-q5 but slower and less accurate. |\n| `ggml-medium.bin` | 1.5 GB | Moderate | Fast | Very good | Full-precision medium model. |\n| `ggml-small.bin` | 488 MB | Fast | Very fast | Good | Faster inference, lower accuracy. |\n| `ggml-base.bin` | 148 MB | Fast | Very fast | Fair | Lightweight. Fast but noticeably less accurate. |\n| `ggml-tiny.bin` | 78 MB | Very fast | Very fast | Basic | Smallest model. Only for testing or constrained environments. |\n\nThe Q5 quantized models offer nearly identical quality to their F16 counterparts at a fraction of the size. `ggml-large-v3-turbo-q5_0` is the default when downloading from the plugin settings page.\n\n## Configuration\n\nAfter installation, navigate to **Dashboard** \u003e **Plugins** \u003e **WhisperSubs** to configure:\n\n| Setting | Description |\n|---|---|\n| **Default Language** | `Auto-detect` reads the language from each file's audio stream metadata and generates matching subtitles. Choose a specific language to force it for all transcriptions. |\n| **Subtitle Mode** | Full, Forced Only, or Full + Forced. See [Subtitle Modes](#subtitle-modes) below. |\n| **Enable Auto-Generation** | When enabled, the scheduled task will scan selected libraries and generate subtitles for items that lack them. |\n| **Enabled Libraries** | Select which libraries should be monitored for automatic subtitle generation. |\n| **Enable Lyrics Generation** | When enabled, music libraries are scanned and audio tracks receive `.lrc` lyrics files (experimental -- whisper is optimized for speech, not singing). |\n| **Whisper Binary Path** | *(Advanced)* Absolute path to the `whisper-cli` binary. Leave empty to use the auto-downloaded binary or search `PATH`. |\n| **Whisper Model Path** | *(Advanced)* Absolute path to the GGML model file. Leave empty to use the auto-downloaded model. |\n| **Whisper Thread Count** | *(Advanced)* Number of CPU threads for whisper inference. `0` = whisper default (4). Set to your CPU core count for faster transcription. |\n\n### Subtitle Modes\n\n| Mode | What it generates | Performance |\n|---|---|---|\n| **Full** (default) | Complete transcription of all speech | Fast -- single whisper run per audio track |\n| **Forced Only** | Only foreign-language dialogue (e.g., French lines in an English movie) | Slow -- see below |\n| **Full + Forced** | Both files per track | Slowest -- runs both pipelines |\n\n\u003e **Performance warning for Forced / Full + Forced modes:**\n\u003e Forced subtitle generation uses a multi-step pipeline: audio extraction, VAD-based speech segmentation, then **per-chunk language detection** on every ~30-second segment of the movie. For a 2-hour film this means ~240 individual whisper calls just for detection, before any transcription begins. On CPU, this phase alone can take **10--20+ minutes per movie**. GPU acceleration helps significantly.\n\u003e\n\u003e If you don't need forced subtitles (most users don't), use **Full** mode for much faster processing.\n\n### Language Handling\n\nThe plugin supports three language modes:\n\n1. **Auto-detect (recommended)** -- The plugin uses FFprobe to read the audio stream's language tag (e.g., `spa` → `es`, `eng` → `en`). Subtitles are generated in the language that matches the audio. If a file has multiple audio tracks in different languages, subtitles are generated for each one.\n\n2. **Whisper auto-detection** -- When no language metadata is available, the request falls through to whisper's built-in language detection (`-l auto`), which analyzes the first 30 seconds of audio.\n\n3. **Forced language** -- Set a specific language code (e.g., `es`) in the configuration or per-request via the API. This overrides detection and tells whisper to transcribe using that language model.\n\n## Usage\n\n### Admin Dashboard\n\nThe plugin adds a dedicated page to the Jellyfin admin dashboard (accessible from **Dashboard** \u003e **Plugins** \u003e **WhisperSubs**, or from the main sidebar menu). From there you can:\n\n- **Configure** the plugin settings (language, subtitle mode, binary/model paths, enabled libraries).\n- **Manage the whisper engine** -- download binaries (CPU / Vulkan / CUDA / ROCm) and models directly from the UI.\n- **Browse** all libraries and their items.\n- **See** which items already have subtitles (green check / orange cross).\n- **Select a language** for subtitle generation (auto-detect or any specific language).\n- **Generate** subtitles for individual items with a single click.\n- **Monitor progress** -- a live banner shows the current item, processing phase, and queue depth.\n\n### REST API\n\nAll endpoints require Jellyfin admin authentication. Setup endpoints additionally require elevated privileges.\n\n**Library \u0026 Items**\n\n| Method | Endpoint | Description |\n|---|---|---|\n| `GET` | `/Plugins/WhisperSubs/Libraries` | List all media libraries |\n| `GET` | `/Plugins/WhisperSubs/Libraries/{libraryId}/Items` | List items in a library (supports `startIndex` and `limit`) |\n| `POST` | `/Plugins/WhisperSubs/Items/{itemId}/Generate?language=auto` | Queue subtitle generation (priority) |\n| `GET` | `/Plugins/WhisperSubs/Items/{itemId}/AudioLanguages` | Detect audio languages in a media file |\n| `GET` | `/Plugins/WhisperSubs/Items/{itemId}/Status` | Check subtitle generation status |\n\n**Queue \u0026 Task**\n\n| Method | Endpoint | Description |\n|---|---|---|\n| `GET` | `/Plugins/WhisperSubs/Queue` | Queue status: current item, progress, phase, remaining count |\n| `POST` | `/Plugins/WhisperSubs/RunTask` | Trigger the scheduled subtitle generation task |\n| `GET` | `/Plugins/WhisperSubs/Models` | List downloaded models with active/size info |\n\n**Engine Setup** (requires elevated privileges)\n\n| Method | Endpoint | Description |\n|---|---|---|\n| `GET` | `/Plugins/WhisperSubs/Setup/Status` | Binary/model status, GPU detection, platform info |\n| `GET` | `/Plugins/WhisperSubs/Setup/BinaryVariants` | Available binary variants for this platform |\n| `POST` | `/Plugins/WhisperSubs/Setup/DownloadBinary?variant=cpu` | Download whisper-cli binary |\n| `GET` | `/Plugins/WhisperSubs/Setup/AvailableModels` | Model catalog with sizes and descriptions |\n| `POST` | `/Plugins/WhisperSubs/Setup/DownloadModel?name=...` | Download a model from HuggingFace |\n| `GET` | `/Plugins/WhisperSubs/Setup/Progress` | Download progress (percent, message, errors) |\n| `POST` | `/Plugins/WhisperSubs/Setup/Models/{filename}/Activate` | Set a downloaded model as active |\n| `DELETE` | `/Plugins/WhisperSubs/Setup/Models/{filename}` | Delete a downloaded model |\n\nThe `language` parameter accepts `auto` (default), or any ISO 639-1 code (`en`, `es`, `fr`, etc.).\n\n### Scheduled Task\n\nA scheduled task named **Generate Subtitles** is registered under the **WhisperSubs** category. It can be configured in **Dashboard** \u003e **Scheduled Tasks** with your preferred schedule or triggered manually. The task:\n\n1. Scans all enabled libraries (or all libraries if none are explicitly selected).\n2. Finds video items that lack subtitles.\n3. Generates subtitles using the configured default language (auto-detect by default).\n4. Reports progress in the Jellyfin task UI.\n\n## How It Works\n\n1. **Language Detection** -- FFprobe reads the audio stream metadata to determine the spoken language(s).\n2. **Audio Extraction** -- FFmpeg extracts a 16 kHz mono WAV track from the media file.\n3. **Transcription** -- The extracted audio is passed to whisper.cpp, which produces an SRT subtitle file. For forced subtitles, the audio is first segmented via VAD (silence detection), then each ~30-second chunk is language-classified before selectively transcribing only foreign-language segments.\n4. **Output** -- Files are saved alongside the original media:\n   - Full subtitles: `Movie.es.generated.srt`\n   - Forced subtitles: `Movie.es.forced.generated.srt`\n   - Lyrics: `Song.lrc`\n5. **Metadata Refresh** -- The item's metadata is refreshed so Jellyfin picks up the new files immediately.\n\nTemporary audio files are cleaned up automatically after processing. Items that have already been processed are tracked with marker files (`.noforeignlang`) to avoid redundant work on subsequent scans.\n\n## Roadmap\n\nSee [ROADMAP.md](ROADMAP.md) for planned features and design details.\n\n## Other Jellyfin Projects by GeiserX\n\n- [smart-covers](https://github.com/GeiserX/smart-covers) — Cover extraction for books, audiobooks, comics, magazines, and music libraries with online fallback\n- [quality-gate](https://github.com/GeiserX/quality-gate) — Restrict users to specific media versions based on configurable path-based policies\n- [jellyfin-encoder](https://github.com/GeiserX/jellyfin-encoder) — Automatic 720p HEVC/AV1 transcoding service with hardware acceleration\n- [jellyfin-telegram-channel-sync](https://github.com/GeiserX/jellyfin-telegram-channel-sync) — Sync Jellyfin access with Telegram channel membership\n\n\n## License\n\nThis project is licensed under the **GNU General Public License v3.0**. See the [LICENSE](LICENSE) file for the full text.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGeiserX%2Fwhisper-subs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGeiserX%2Fwhisper-subs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGeiserX%2Fwhisper-subs/lists"}