{"id":50338820,"url":"https://github.com/endo-ly/voice-gateway","last_synced_at":"2026-05-29T15:30:36.416Z","repository":{"id":353911318,"uuid":"1221366150","full_name":"endo-ly/voice-gateway","owner":"endo-ly","description":"OpenAI-compatible TTS adapter server for stable local voice workflows.","archived":false,"fork":false,"pushed_at":"2026-05-27T12:23:03.000Z","size":358,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-27T13:25:58.270Z","etag":null,"topics":["fastapi","irodori-tts","openai-compatible","tts"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/endo-ly.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-26T05:21:06.000Z","updated_at":"2026-05-27T12:23:48.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/endo-ly/voice-gateway","commit_stats":null,"previous_names":["endo-ly/tts-adapter","endo-ly/voice-gateway"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/endo-ly/voice-gateway","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/endo-ly%2Fvoice-gateway","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/endo-ly%2Fvoice-gateway/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/endo-ly%2Fvoice-gateway/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/endo-ly%2Fvoice-gateway/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/endo-ly","download_url":"https://codeload.github.com/endo-ly/voice-gateway/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/endo-ly%2Fvoice-gateway/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33659872,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","irodori-tts","openai-compatible","tts"],"created_at":"2026-05-29T15:30:35.526Z","updated_at":"2026-05-29T15:30:36.396Z","avatar_url":"https://github.com/endo-ly.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# voice-gateway\n\nA gateway server that exposes multiple TTS and STT engines through a single OpenAI-compatible API.\n\nIt absorbs the differences in API format and configuration between speech engines, so clients can handle audio input and output without knowing which engine is behind the API. Use it when agent systems or external tools need TTS and STT through the same API.\n\n## Features\n\n- **Provider abstraction**: swap TTS and STT engines without changing the API\n- **OpenAI-compatible API**: `/v1/audio/speech` and `/v1/audio/transcriptions` work with OpenAI clients\n- **Native API**: custom endpoints that accept extended parameters\n- **YAML profiles**: manage model and voice settings in YAML\n- **Mode switching**: one codebase switches between TTS-only, STT-only, and combined operation\n\n## Server modes\n\n`VOICE_GATEWAY_MODE` selects which features are started. The same codebase can serve different roles on different machines.\n\n| Mode | 
Registered routes | Use case |\n|------|-------------|------------|\n| `tts` | TTS + common | Run Irodori or similar on a Windows machine with a GPU |\n| `stt` | STT + common | Run ReazonSpeech or similar on a lightweight mini PC |\n| `all` | All routes | Run both TTS and STT on one machine |\n\n## Quick start\n\n### 1. Install\n\n```bash\ngit clone https://github.com/endo-ly/voice-gateway.git \u0026\u0026 cd voice-gateway\nuv sync --group dev\n\n# To include ReazonSpeech K2 (STT):\nuv sync --group dev --extra reazonspeech-k2\n./scripts/install-reazonspeech-k2.sh\n```\n\n### 2. Configure\n\nCreate the configuration files from the templates:\n\n```bash\ncp assets/models/models.example.yaml assets/models/models.yaml\ncp assets/voices/your-voice-name/profile.example.yaml assets/voices/your-voice-name/profile.yaml\n```\n\nSet the environment variables (a `.env` file also works):\n\n```bash\n# Mode (default: all)\nexport VOICE_GATEWAY_MODE=all\n\n# Irodori-TTS (when using TTS)\nexport IRODORI_REPO_DIR=/path/to/Irodori-TTS\n\n# AivisSpeech (when voice-gateway should also start the Engine)\nexport AIVIS_MANAGE_ENGINE=true\nexport AIVIS_ENGINE_DIR=.vendor/AivisSpeech-Engine\n```\n\n### 3. Start\n\nChoose `--host` to match your setup:\n\n```bash\n# Access from the same machine only (development, local use)\nuv run uvicorn app.main:app --host 127.0.0.1 --port 8012\n\n# Access from other machines (bind to all interfaces)\nuv run uvicorn app.main:app --host 0.0.0.0 --port 8012\n\n# Access from other machines (bind to a specific interface)\nuv run uvicorn app.main:app --host 192.168.0.86 --port 8012\n```\n\n### 4. 
Verify\n\n```bash\ncurl http://127.0.0.1:8012/health\n# → {\"status\":\"ok\",\"providers\":{\"tts\":{\"registered\":[\"irodori\"],\"loaded\":[\"irodori\"]},...}}\n```\n\n## Usage\n\n### TTS (speech synthesis)\n\n**OpenAI-compatible:**\n\n```bash\ncurl -X POST http://127.0.0.1:8012/v1/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"tts-default\",\"voice\":\"your-voice-name\",\"input\":\"こんにちは\"}' \\\n  --output output.wav\n```\n\n**Native (extended parameters):**\n\n```bash\ncurl -X POST http://127.0.0.1:8012/v1/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"tts-default\",\"voice_id\":\"your-voice-name\",\"speech_text\":\"こんにちは\"}' \\\n  --output output.wav\n```\n\n### STT (speech recognition)\n\n**OpenAI-compatible:**\n\n```bash\ncurl -X POST http://127.0.0.1:8012/v1/audio/transcriptions \\\n  -F \"file=@audio.wav\" \\\n  -F \"model=stt-default\"\n# → {\"text\":\"transcribed text\"}\n```\n\n**Native (extended parameters):**\n\n```bash\ncurl -X POST http://127.0.0.1:8012/v1/transcribe \\\n  -F \"file=@audio.wav\" \\\n  -F \"model=stt-default\" \\\n  -F \"source=stackchan\"\n```\n\n**Most recent transcription:**\n\n```bash\ncurl http://127.0.0.1:8012/v1/transcribe/latest\n```\n\n### Server information\n\n```bash\n# Mode, providers, and models\ncurl http://127.0.0.1:8012/v1/capabilities\n\n# List models\ncurl http://127.0.0.1:8012/v1/models\n\n# List voices (tts/all modes only)\ncurl http://127.0.0.1:8012/v1/voices\n```\n\n## Environment variables\n\n### Common\n\n| Variable | Default | Description |\n|----------|---------|------|\n| `VOICE_GATEWAY_MODE` | `all` | Server mode: `tts`, `stt`, `all` |\n| `HOST` | `127.0.0.1` | Listen host |\n| `PORT` | `8012` | Listen port |\n| `LOG_LEVEL` | `INFO` | Log level |\n| `TIMEOUT_SEC` | `120` | Provider execution timeout (seconds) |\n| `MAX_CONCURRENCY` | `1` | Maximum number of concurrent executions |\n\n### TTS\n\n| Variable | Default | Description |\n|----------|---------|------|\n| `IRODORI_REPO_DIR` | (none) | Irodori-TTS installation path (required when using Irodori) |\n| `AIVIS_BASE_URL` | `http://127.0.0.1:10101` | URL of the AivisSpeech Engine |\n| `AIVIS_MANAGE_ENGINE` | `false` | When `true`, the AivisSpeech Engine is also started when voice-gateway starts |\n| 
`AIVIS_ENGINE_DIR` | `.vendor/AivisSpeech-Engine` | Directory of the AivisSpeech Engine to start in managed mode |\n| `AIVIS_ENGINE_BIND_HOST` | (none) | Bind host used when starting the Engine (derived from `AIVIS_BASE_URL` when unset) |\n| `AIVIS_ENGINE_PORT` | (none) | Port used when starting the Engine (derived from `AIVIS_BASE_URL` when unset) |\n| `AIVIS_USE_GPU` | `false` | When `true`, the Engine is started with `--use_gpu` |\n| `AIVIS_STARTUP_TIMEOUT_SEC` | `180` | Timeout while waiting for the AivisSpeech Engine to start (seconds) |\n\n### STT\n\n| Variable | Default | Description |\n|----------|---------|------|\n| `REAZONSPEECH_REPO_DIR` | `.vendor/ReazonSpeech` | Root path where the ReazonSpeech repository is cloned (used by the install script) |\n| `STT_CALLBACK_URL` | (none) | Callback URL invoked when a transcription completes |\n| `STT_CALLBACK_TIMEOUT_MS` | `3000` | Callback timeout (ms) |\n\n## API endpoints\n\n### Common (all modes)\n\n| Method | Path | Description |\n|--------|----------|------|\n| GET | `/health` | Health check + provider status |\n| GET | `/v1/models` | List models |\n| GET | `/v1/capabilities` | Server capability information |\n\n### TTS (tts / all)\n\n| Method | Path | Description |\n|--------|----------|------|\n| GET | `/v1/voices` | List voices |\n| POST | `/v1/audio/speech` | OpenAI-compatible TTS |\n| POST | `/v1/speech` | Native TTS |\n\n### STT (stt / all)\n\n| Method | Path | Description |\n|--------|----------|------|\n| POST | `/v1/audio/transcriptions` | OpenAI-compatible STT |\n| POST | `/v1/transcribe` | Native STT |\n| GET | `/v1/transcribe/latest` | Most recent transcription |\n\n## Supported providers\n\n| Provider | Direction | Invocation | Environment |\n|----------|------|------------|---------|\n| [Irodori-TTS](docs/providers/irodori.md) | TTS | CLI subprocess | Windows / Linux, GPU recommended |\n| [AivisSpeech Engine](docs/providers/aivis-speech.md) | TTS | HTTP API / managed process | managed: Linux / external: Linux / Windows |\n| [ReazonSpeech K2](docs/providers/reazonspeech-k2.md) | STT | Python import | Linux |\n\n## Documentation\n\n| Document | Contents |\n|----------|------|\n| [Concept](docs/CONCEPT.md) | Design philosophy and reasons to use it |\n| [API reference](docs/api-reference.md) | Specification of every endpoint |\n| [Configuration guide](docs/configuration.md) | Environment variables and YAML profiles |\n| [Architecture](docs/architecture.md) | Layer structure and data flow |\n| [Extension guide](docs/extension-guide.md) | Provider / Voice / 
Model addition steps |\n| [Development guide](docs/development.md) | Environment setup, testing, project layout |\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fendo-ly%2Fvoice-gateway","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fendo-ly%2Fvoice-gateway","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fendo-ly%2Fvoice-gateway/lists"}