{"id":28464200,"url":"https://github.com/resemble-ai/chatterbox","last_synced_at":"2025-06-10T20:03:42.001Z","repository":{"id":296163656,"uuid":"971241545","full_name":"resemble-ai/chatterbox","owner":"resemble-ai","description":"SoTA open-source TTS","archived":false,"fork":false,"pushed_at":"2025-06-04T13:54:02.000Z","size":45,"stargazers_count":5924,"open_issues_count":65,"forks_count":653,"subscribers_count":53,"default_branch":"master","last_synced_at":"2025-06-07T05:09:00.063Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://resemble-ai.github.io/chatterbox_demopage/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/resemble-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-23T08:16:38.000Z","updated_at":"2025-06-07T05:06:25.000Z","dependencies_parsed_at":"2025-05-29T09:41:12.720Z","dependency_job_id":null,"html_url":"https://github.com/resemble-ai/chatterbox","commit_stats":null,"previous_names":["resemble-ai/chatterbox"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/resemble-ai%2Fchatterbox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/resemble-ai%2Fchatterbox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/resemble-ai%2Fchatterbox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/resemble-ai%2Fchatterbox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/resemble-ai","download_url":"https://codeload.github.com/resemble-ai/chatterbox/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/resemble-ai%2Fchatterbox/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259143579,"owners_count":22811904,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-07T05:08:59.522Z","updated_at":"2025-06-10T20:03:41.992Z","avatar_url":"https://github.com/resemble-ai.png","language":"Python","funding_links":[],"categories":["🛠️ 一、工具类项目","Developer Tools","Python","Text-to-Speech (TTS)","others","Repos","语音合成","TTS (Text-to-Speech) | 文本转语音","Colab Notebooks","\u003cspan id=\"speech\"\u003eSpeech\u003c/span\u003e","Artificial Intelligence","2. Open Foundation Models","🔧 Utilities \u0026 Miscellaneous","Text-to-Speech (TTS) Models","AI开源项目","4. 
Text-to-speech (TTS)"],"sub_categories":["🎵🎬 1.3 音视频处理工具","Python Tools","Open-Source Models \u0026 Libraries","资源传输下载","Open Source TTS Models | 开源 TTS 模型","Chatterbox TTS","\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e","Android Launcher","Chatterbox","AI 工具","Open source"],"readme":"\n\u003cimg width=\"1200\" alt=\"cb-big2\" src=\"https://github.com/user-attachments/assets/bd8c5f03-e91d-4ee5-b680-57355da204d1\" /\u003e\n\n# Chatterbox TTS\n\n[![Alt Text](https://img.shields.io/badge/listen-demo_samples-blue)](https://resemble-ai.github.io/chatterbox_demopage/)\n[![Alt Text](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/ResembleAI/Chatterbox)\n[![Alt Text](https://static-public.podonos.com/badges/insight-on-pdns-sm-dark.svg)](https://podonos.com/resembleai/chatterbox)\n[![Discord](https://img.shields.io/discord/1377773249798344776?label=join%20discord\u0026logo=discord\u0026style=flat)](https://discord.gg/rJq9cRJBJ6)\n\n_Made with ♥️ by \u003ca href=\"https://resemble.ai\" target=\"_blank\"\u003e\u003cimg width=\"100\" alt=\"resemble-logo-horizontal\" src=\"https://github.com/user-attachments/assets/35cf756b-3506-4943-9c72-c05ddfa4e525\" /\u003e\u003c/a\u003e\n\nWe're excited to introduce Chatterbox, [Resemble AI's](https://resemble.ai) first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations.\n\nWhether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support **emotion exaggeration control**, a powerful feature that makes your voices stand out. 
Try it now on our [Hugging Face Gradio app.](https://huggingface.co/spaces/ResembleAI/Chatterbox)\n\nIf you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service (\u003ca href=\"https://resemble.ai\"\u003elink\u003c/a\u003e). It delivers reliable performance with ultra-low latency under 200 ms, ideal for production use in agents, applications, or interactive media.\n\n# Key Details\n- SoTA zero-shot TTS\n- 0.5B Llama backbone\n- Unique exaggeration/intensity control\n- Ultra-stable with alignment-informed inference\n- Trained on 0.5M hours of cleaned data\n- Watermarked outputs\n- Easy voice conversion script\n- [Outperforms ElevenLabs](https://podonos.com/resembleai/chatterbox)\n\n# Tips\n- **General Use (TTS and Voice Agents):**\n  - The default settings (`exaggeration=0.5`, `cfg_weight=0.5`) work well for most prompts.\n  - If the reference speaker has a fast speaking style, lowering `cfg_weight` to around `0.3` can improve pacing.\n\n- **Expressive or Dramatic Speech:**\n  - Try lower `cfg_weight` values (e.g. `~0.3`) and increase `exaggeration` to around `0.7` or higher.\n  - Higher `exaggeration` tends to speed up speech; reducing `cfg_weight` helps compensate with slower, more deliberate pacing.\n\n\n# Installation\n```shell\npip install chatterbox-tts\n```\n\nAlternatively, you can install from source:\n```shell\n# conda create -yn chatterbox python=3.11\n# conda activate chatterbox\n\ngit clone https://github.com/resemble-ai/chatterbox.git\ncd chatterbox\npip install -e .\n```\nWe developed and tested Chatterbox on Python 3.11 on Debian 11; the versions of the dependencies are pinned in `pyproject.toml` to ensure consistency. 
You can modify the code or dependencies in this installation mode.\n\n\n# Usage\n```python\nimport torchaudio as ta\nfrom chatterbox.tts import ChatterboxTTS\n\nmodel = ChatterboxTTS.from_pretrained(device=\"cuda\")\n\ntext = \"Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill.\"\nwav = model.generate(text)\nta.save(\"test-1.wav\", wav, model.sr)\n\n# If you want to synthesize with a different voice, specify the audio prompt\nAUDIO_PROMPT_PATH = \"YOUR_FILE.wav\"\nwav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)\nta.save(\"test-2.wav\", wav, model.sr)\n```\nSee `example_tts.py` and `example_vc.py` for more examples.\n\n# Supported Language\nCurrently only English.\n\n# Acknowledgements\n- [Cosyvoice](https://github.com/FunAudioLLM/CosyVoice)\n- [Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning)\n- [HiFT-GAN](https://github.com/yl4579/HiFTNet)\n- [Llama 3](https://github.com/meta-llama/llama3)\n- [S3Tokenizer](https://github.com/xingchensong/S3Tokenizer)\n\n# Built-in PerTh Watermarking for Responsible AI\n\nEvery audio file generated by Chatterbox includes [Resemble AI's Perth (Perceptual Threshold) Watermarker](https://github.com/resemble-ai/perth) - imperceptible neural watermarks that survive MP3 compression, audio editing, and common manipulations while maintaining nearly 100% detection accuracy.\n\n\n## Watermark extraction\n\nYou can look for the watermark using the following script.\n\n```python\nimport perth\nimport librosa\n\nAUDIO_PATH = \"YOUR_FILE.wav\"\n\n# Load the watermarked audio\nwatermarked_audio, sr = librosa.load(AUDIO_PATH, sr=None)\n\n# Initialize watermarker (same as used for embedding)\nwatermarker = perth.PerthImplicitWatermarker()\n\n# Extract watermark\nwatermark = watermarker.get_watermark(watermarked_audio, sample_rate=sr)\nprint(f\"Extracted watermark: {watermark}\")\n# Output: 0.0 (no watermark) or 1.0 
(watermarked)\n```\n\n\n# Official Discord\n\n👋 Join us on [Discord](https://discord.gg/rJq9cRJBJ6) and let's build something awesome together!\n\n# Disclaimer\nDon't use this model to do bad things. Prompts are sourced from freely available data on the internet.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fresemble-ai%2Fchatterbox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fresemble-ai%2Fchatterbox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fresemble-ai%2Fchatterbox/lists"}