{"id":30892299,"url":"https://github.com/evilfreelancer/docker-fish-speech-server","last_synced_at":"2025-09-08T19:43:36.199Z","repository":{"id":288125250,"uuid":"965089129","full_name":"EvilFreelancer/docker-fish-speech-server","owner":"EvilFreelancer","description":"OpenAPI-like API-server for voice generation (TTS) based on fish-speech-1.5 model.","archived":false,"fork":false,"pushed_at":"2025-05-24T16:27:41.000Z","size":8065,"stargazers_count":22,"open_issues_count":2,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-07-30T09:50:39.518Z","etag":null,"topics":["api","docker","fish-speech","openai-api","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EvilFreelancer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-12T11:41:29.000Z","updated_at":"2025-07-28T09:30:27.000Z","dependencies_parsed_at":"2025-04-15T17:47:02.037Z","dependency_job_id":"0678e88c-53e1-4538-a162-3c3fe2205bed","html_url":"https://github.com/EvilFreelancer/docker-fish-speech-server","commit_stats":null,"previous_names":["evilfreelancer/docker-fish-speech-server"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EvilFreelancer/docker-fish-speech-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-fish-speech-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-fish-speech-server/tags","release
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-fish-speech-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-fish-speech-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EvilFreelancer","download_url":"https://codeload.github.com/EvilFreelancer/docker-fish-speech-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EvilFreelancer%2Fdocker-fish-speech-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274231433,"owners_count":25245585,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","docker","fish-speech","openai-api","text-to-speech","tts"],"created_at":"2025-09-08T19:43:34.673Z","updated_at":"2025-09-08T19:43:36.172Z","avatar_url":"https://github.com/EvilFreelancer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fish Speech API Webserver in Docker\n\nOpenAPI-like voice generation server based on [fish-speech-1.5](https://huggingface.co/fishaudio/fish-speech-1.5).\n\nSupports `text-to-speech` and voice style transfer via reference audio samples.\n\n## Requirements\n\n* Nvidia GPU\n* For Docker-way\n    * 
Nvidia Docker Runtime\n    * Docker\n    * Docker Compose\n* For Manual Setup\n    * Python 3.12\n    * Python Venv\n\n## 🔧 Quick Start\n\nClone the repo first:\n\n```shell\ngit clone --recurse-submodules git@github.com:EvilFreelancer/docker-fish-speech-server.git\ncd docker-fish-speech-server\n```\n\n### Docker-way\n\n```shell\ncp docker-compose.dist.yml docker-compose.yml\ndocker compose up -d\n```\n\nEnter the container:\n\n```shell\ndocker compose exec api bash\n```\n\nDownload the model:\n\n```shell\nhuggingface-cli download fishaudio/fish-speech-1.5 --local-dir models/fish-speech-1.5/\n```\n\n### Manual Setup\n\n```shell\napt install cmake portaudio19-dev\n```\n\nSet up a virtual environment, activate it, and install dependencies:\n\n```shell\npython3.12 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n```\n\nDownload the model:\n\n```shell\nhuggingface-cli download fishaudio/fish-speech-1.5 --local-dir models/fish-speech-1.5/\n```\n\nRun the API server:\n\n```shell\npython main.py\n```\n\n## 🧪 Testing the API\n\n### Generate speech with default voice\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -X POST \\\n  -F model=\"fish-speech-1.5\" \\\n  -F input=\"Hello, this is a test of Fish Speech API\" \\\n  --output \"speech.wav\"\n```\n\nIn JSON format:\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n      \"model\": \"fish-speech-1.5\",\n      \"input\": \"Hello, this is a test of Fish Speech API\"\n  }' \\\n  --output \"speech.wav\"\n```\n\n### Generate speech with example voice\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -X POST \\\n  -F model=\"fish-speech-1.5\" \\\n  -F voice=\"english-nice\" \\\n  -F input=\"Dr. 
Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\" \\\n  --output \"speech.wav\"\n```\n\nIn JSON format:\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n      \"model\": \"fish-speech-1.5\",\n      \"voice\": \"english-nice\",\n      \"input\": \"Dr. Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\"\n  }' \\\n  --output \"speech.wav\"\n```\n\n### Generate speech with reference voice\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -X POST \\\n  -H 'Content-Type: multipart/form-data' \\\n  -F model=\"fish-speech-1.5\" \\\n  -F input=\"Dr. Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\" \\\n  -F reference_audio=\"@voice-viola.wav\" \\\n  --output \"speech.wav\"\n```\n\nIn JSON format:\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n      \"model\": \"fish-speech-1.5\",\n      \"input\": \"Dr. 
Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\",\n      \"reference_audio\": \"=base64...\"\n  }' \\\n  --output \"speech.wav\"\n```\n\n#### Advanced settings\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -X POST \\\n  -H 'Content-Type: multipart/form-data' \\\n  -F model=\"fish-speech-1.5\" \\\n  -F input=\"Dr. Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\" \\\n  -F top_p=\"0.1\" \\\n  -F repetition_penalty=\"1.3\" \\\n  -F temperature=\"0.75\" \\\n  -F chunk_length=\"150\" \\\n  -F max_new_tokens=\"768\" \\\n  -F seed=\"42\" \\\n  -F reference_audio=\"@voice-viola.wav\" \\\n  --output \"speech.wav\"\n```\n\nIn JSON format:\n\n```shell\ncurl http://localhost:8000/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n      \"model\": \"fish-speech-1.5\",\n      \"input\": \"Dr. 
Eleanor Whitaker, a quantum physicist from Edinburgh, surreptitiously analyzed the enigmatic hieroglyphs while humming Für Elise —her quizzical expression mirrored the cryptic symbols perplexing arrangement, yet she remained determined to decipher their archaic secrets.\",\n      \"top_p\": \"0.1\",\n      \"repetition_penalty\": \"1.3\",\n      \"temperature\": \"0.75\",\n      \"chunk_length\": \"150\",\n      \"max_new_tokens\": \"768\",\n      \"seed\": \"42\",\n      \"reference_audio\": \"=base64...\"\n  }' \\\n  --output \"speech.wav\"\n```\n\n## Links\n\n- https://github.com/fishaudio/fish-speech\n- https://huggingface.co/fishaudio/fish-speech-1.5\n- https://huggingface.co/fishaudio/fish-agent-v0.1-3b\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fdocker-fish-speech-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevilfreelancer%2Fdocker-fish-speech-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevilfreelancer%2Fdocker-fish-speech-server/lists"}