{"id":13405679,"url":"https://github.com/coqui-ai/TTS","last_synced_at":"2025-03-14T10:31:22.204Z","repository":{"id":37038547,"uuid":"265612440","full_name":"coqui-ai/TTS","owner":"coqui-ai","description":"🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production","archived":false,"fork":false,"pushed_at":"2024-08-16T12:07:14.000Z","size":170196,"stargazers_count":34753,"open_issues_count":90,"forks_count":4220,"subscribers_count":288,"default_branch":"dev","last_synced_at":"2024-10-20T10:46:47.074Z","etag":null,"topics":["deep-learning","glow-tts","hifigan","melgan","multi-speaker-tts","python","pytorch","speaker-encoder","speaker-encodings","speech","speech-synthesis","tacotron","text-to-speech","tts","tts-model","vocoder","voice-cloning","voice-conversion","voice-synthesis"],"latest_commit_sha":null,"homepage":"http://coqui.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coqui-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-20T15:45:28.000Z","updated_at":"2024-10-20T10:08:16.000Z","dependencies_parsed_at":"2024-02-10T15:28:40.563Z","dependency_job_id":"6b56dca2-e25f-4eb8-af12-eb6c4636b450","html_url":"https://github.com/coqui-ai/TTS","commit_stats":{"total_commits":4240,"total_committers":173,"mean_commits":"24.508670520231213","dds":0.6955188679245283,"last_synced_commit":"dbf1a08a0d4e47fdad6172e433eeb34bc6b13b4e"},"previous_names":[],"tags_count":98,"template":false,"template_full_name":null,"repository_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2FTTS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2FTTS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2FTTS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2FTTS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coqui-ai","download_url":"https://codeload.github.com/coqui-ai/TTS/tar.gz/refs/heads/dev","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221458141,"owners_count":16825271,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","glow-tts","hifigan","melgan","multi-speaker-tts","python","pytorch","speaker-encoder","speaker-encodings","speech","speech-synthesis","tacotron","text-to-speech","tts","tts-model","vocoder","voice-cloning","voice-conversion","voice-synthesis"],"created_at":"2024-07-30T19:02:08.268Z","updated_at":"2025-03-14T10:31:22.199Z","avatar_url":"https://github.com/coqui-ai.png","language":"Python","funding_links":[],"categories":["Python","语音识别 tts, stt","User Interaction","Natural Language Processing","\u003cspan id=\"speech\"\u003eSpeech\u003c/span\u003e","Refactored","Music \u0026 Audio","HarmonyOS","语音合成","Text-to-Speech (TTS)","Tools \u0026 Frameworks","3.4 Further Links on Audio Synthesis and Detection","Advanced Topics","TTS (Text-to-Speech) | 文本转语音","Repos","🎙️ Voice \u0026 Audio Tools","Speech Processing","Multimodal","3. 
**Real-World Applications**","Voice \u0026 Multimodal (local) (16)","Developer Resources","📦 Legacy \u0026 Inactive Projects","Tasks and Methods","🎤 Speech \u0026 Audio","Frameworks \u0026 Libraries","App","Open Source TTS Libraries"],"sub_categories":["Acoustic User Interface","Speech \u0026 Audio","\u003cspan id=\"tool\"\u003eLLM (LLM \u0026 Tool)\u003c/span\u003e","Other Languages","Windows Manager","网络服务_其他","Open-Source Models \u0026 Libraries","Open-source projects","Difference between Watermarking and Cryptography","Voice \u0026 Multimodal","Open Source TTS Models | 开源 TTS 模型","**Speech-to-Text \u0026 Text-to-Speech**","Text-to-Speech","1. Audio","Text To Speech","Speech and Text","Audio \u0026 Speech","Python Libraries"],"readme":"\n## 🐸Coqui.ai News\n- 📣 ⓍTTSv2 is here with 16 languages and better performance across the board.\n- 📣 ⓍTTS fine-tuning code is out. Check the [example recipes](https://github.com/coqui-ai/TTS/tree/dev/recipes/ljspeech).\n- 📣 ⓍTTS can now stream with \u003c200ms latency.\n- 📣 ⓍTTS, our production TTS model that can speak 13 languages, is released [Blog Post](https://coqui.ai/blog/tts/open_xtts), [Demo](https://huggingface.co/spaces/coqui/xtts), [Docs](https://tts.readthedocs.io/en/dev/models/xtts.html)\n- 📣 [🐶Bark](https://github.com/suno-ai/bark) is now available for inference with unconstrained voice cloning. [Docs](https://tts.readthedocs.io/en/dev/models/bark.html)\n- 📣 You can use [~1100 Fairseq models](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) with 🐸TTS.\n- 📣 🐸TTS now supports 🐢Tortoise with faster inference. 
[Docs](https://tts.readthedocs.io/en/dev/models/tortoise.html)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://static.scarf.sh/a.png?x-pxid=cf317fe7-2188-4721-bc01-124bb5d5dbb2\" /\u003e\n\n## \u003cimg src=\"https://raw.githubusercontent.com/coqui-ai/TTS/main/images/coqui-log-green-TTS.png\" height=\"56\"/\u003e\n\n\n**🐸TTS is a library for advanced Text-to-Speech generation.**\n\n🚀 Pretrained models in +1100 languages.\n\n🛠️ Tools for training new models and fine-tuning existing models in any language.\n\n📚 Utilities for dataset analysis and curation.\n______________________________________________________________________\n\n[![Discord](https://img.shields.io/discord/1037326658807533628?color=%239B59B6\u0026label=chat%20on%20discord)](https://discord.gg/5eXr5seRrv)\n[![License](\u003chttps://img.shields.io/badge/License-MPL%202.0-brightgreen.svg\u003e)](https://opensource.org/licenses/MPL-2.0)\n[![PyPI version](https://badge.fury.io/py/TTS.svg)](https://badge.fury.io/py/TTS)\n[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/coqui-ai/TTS/blob/master/CODE_OF_CONDUCT.md)\n[![Downloads](https://pepy.tech/badge/tts)](https://pepy.tech/project/tts)\n[![DOI](https://zenodo.org/badge/265612440.svg)](https://zenodo.org/badge/latestdoi/265612440)\n\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/aux_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/data_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/docker.yaml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/inference_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/style_check.yml/badge.svg)\n![GithubActio
ns](https://github.com/coqui-ai/TTS/actions/workflows/text_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/tts_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/vocoder_tests.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests0.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests1.yml/badge.svg)\n![GithubActions](https://github.com/coqui-ai/TTS/actions/workflows/zoo_tests2.yml/badge.svg)\n[![Docs](\u003chttps://readthedocs.org/projects/tts/badge/?version=latest\u0026style=plastic\u003e)](https://tts.readthedocs.io/en/latest/)\n\n\u003c/div\u003e\n\n______________________________________________________________________\n\n## 💬 Where to ask questions\nPlease use our dedicated channels for questions and discussion. Help is much more valuable if it's shared publicly so that more people can benefit from it.\n\n| Type                            | Platforms                               |\n| ------------------------------- | --------------------------------------- |\n| 🚨 **Bug Reports**              | [GitHub Issue Tracker]                  |\n| 🎁 **Feature Requests \u0026 Ideas** | [GitHub Issue Tracker]                  |\n| 👩‍💻 **Usage Questions**          | [GitHub Discussions]                    |\n| 🗯 **General Discussion**       | [GitHub Discussions] or [Discord]   |\n\n[github issue tracker]: https://github.com/coqui-ai/tts/issues\n[github discussions]: https://github.com/coqui-ai/TTS/discussions\n[discord]: https://discord.gg/5eXr5seRrv\n[Tutorials and Examples]: https://github.com/coqui-ai/TTS/wiki/TTS-Notebooks-and-Tutorials\n\n\n## 🔗 Links and Resources\n| Type                            | Links                               |\n| ------------------------------- | --------------------------------------- |\n| 💼 **Documentation**              | [ReadTheDocs](https://tts.readthedocs.io/en/latest/)\n| 💾 
**Installation**               | [TTS/README.md](https://github.com/coqui-ai/TTS/tree/dev#installation)|\n| 👩‍💻 **Contributing**               | [CONTRIBUTING.md](https://github.com/coqui-ai/TTS/blob/main/CONTRIBUTING.md)|\n| 📌 **Road Map**                   | [Main Development Plans](https://github.com/coqui-ai/TTS/issues/378)|\n| 🚀 **Released Models**            | [TTS Releases](https://github.com/coqui-ai/TTS/releases) and [Experimental Models](https://github.com/coqui-ai/TTS/wiki/Experimental-Released-Models)|\n| 📰 **Papers**                    | [TTS Papers](https://github.com/erogol/TTS-papers)|\n\n\n## 🥇 TTS Performance\n\u003cp align=\"center\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/coqui-ai/TTS/main/images/TTS-performance.png\" width=\"800\" /\u003e\u003c/p\u003e\n\nUnderlined \"TTS*\" and \"Judy*\" are **internal** 🐸TTS models that are not released as open source. They are here to show the potential. Models prefixed with a dot (.Jofish, .Abe, and .Janice) are real human voices.\n\n## Features\n- High-performance Deep Learning models for Text2Speech tasks.\n    - Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech).\n    - Speaker Encoder to compute speaker embeddings efficiently.\n    - Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN).\n- Fast and efficient model training.\n- Detailed training logs on the terminal and TensorBoard.\n- Support for Multi-speaker TTS.\n- Efficient, flexible, lightweight but feature complete `Trainer API`.\n- Released and ready-to-use models.\n- Tools to curate Text2Speech datasets under ```dataset_analysis```.\n- Utilities to use and test your models.\n- Modular (but not too much) code base enabling easy implementation of new ideas.\n\n## Model Implementations\n### Spectrogram models\n- Tacotron: [paper](https://arxiv.org/abs/1703.10135)\n- Tacotron2: [paper](https://arxiv.org/abs/1712.05884)\n- Glow-TTS: [paper](https://arxiv.org/abs/2005.11129)\n- Speedy-Speech: 
[paper](https://arxiv.org/abs/2008.03802)\n- Align-TTS: [paper](https://arxiv.org/abs/2003.01950)\n- FastPitch: [paper](https://arxiv.org/pdf/2006.06873.pdf)\n- FastSpeech: [paper](https://arxiv.org/abs/1905.09263)\n- FastSpeech2: [paper](https://arxiv.org/abs/2006.04558)\n- SC-GlowTTS: [paper](https://arxiv.org/abs/2104.05557)\n- Capacitron: [paper](https://arxiv.org/abs/1906.03402)\n- OverFlow: [paper](https://arxiv.org/abs/2211.06892)\n- Neural HMM TTS: [paper](https://arxiv.org/abs/2108.13320)\n- Delightful TTS: [paper](https://arxiv.org/abs/2110.12612)\n\n### End-to-End Models\n- ⓍTTS: [blog](https://coqui.ai/blog/tts/open_xtts)\n- VITS: [paper](https://arxiv.org/pdf/2106.06103)\n- 🐸 YourTTS: [paper](https://arxiv.org/abs/2112.02418)\n- 🐢 Tortoise: [orig. repo](https://github.com/neonbjb/tortoise-tts)\n- 🐶 Bark: [orig. repo](https://github.com/suno-ai/bark)\n\n### Attention Methods\n- Guided Attention: [paper](https://arxiv.org/abs/1710.08969)\n- Forward Backward Decoding: [paper](https://arxiv.org/abs/1907.09006)\n- Graves Attention: [paper](https://arxiv.org/abs/1910.10288)\n- Double Decoder Consistency: [blog](https://erogol.com/solving-attention-problems-of-tts-models-with-double-decoder-consistency/)\n- Dynamic Convolutional Attention: [paper](https://arxiv.org/pdf/1910.10288.pdf)\n- Alignment Network: [paper](https://arxiv.org/abs/2108.10447)\n\n### Speaker Encoder\n- GE2E: [paper](https://arxiv.org/abs/1710.10467)\n- Angular Loss: [paper](https://arxiv.org/pdf/2003.11982.pdf)\n\n### Vocoders\n- MelGAN: [paper](https://arxiv.org/abs/1910.06711)\n- MultiBandMelGAN: [paper](https://arxiv.org/abs/2005.05106)\n- ParallelWaveGAN: [paper](https://arxiv.org/abs/1910.11480)\n- GAN-TTS discriminators: [paper](https://arxiv.org/abs/1909.11646)\n- WaveRNN: [origin](https://github.com/fatchord/WaveRNN/)\n- WaveGrad: [paper](https://arxiv.org/abs/2009.00713)\n- HiFiGAN: [paper](https://arxiv.org/abs/2010.05646)\n- UnivNet: 
[paper](https://arxiv.org/abs/2106.07889)\n\n### Voice Conversion\n- FreeVC: [paper](https://arxiv.org/abs/2210.15418)\n\nYou can also help us implement more models.\n\n## Installation\n🐸TTS is tested on Ubuntu 18.04 with **python \u003e= 3.9, \u003c 3.12**.\n\nIf you are only interested in [synthesizing speech](https://tts.readthedocs.io/en/latest/inference.html) with the released 🐸TTS models, installing from PyPI is the easiest option.\n\n```bash\npip install TTS\n```\n\nIf you plan to code or train models, clone 🐸TTS and install it locally.\n\n```bash\ngit clone https://github.com/coqui-ai/TTS\ncd TTS\npip install -e .[all,dev,notebooks]  # Select the relevant extras\n```\n\nIf you are on Ubuntu (Debian), you can also run the following commands for installation.\n\n```bash\n$ make system-deps  # intended to be used on Ubuntu (Debian). Let us know if you have a different OS.\n$ make install\n```\n\nIf you are on Windows, 👑@GuyPaddock wrote installation instructions [here](https://stackoverflow.com/questions/66726331/how-can-i-run-mozilla-tts-coqui-tts-training-with-cuda-on-a-windows-system).\n\n\n## Docker Image\nYou can also try TTS without installing it by using the Docker image.\nRun the following commands to start a container and launch the server from inside it.\n\n```bash\ndocker run --rm -it -p 5002:5002 --entrypoint /bin/bash ghcr.io/coqui-ai/tts-cpu\npython3 TTS/server/server.py --list_models # To get the list of available models\npython3 TTS/server/server.py --model_name tts_models/en/vctk/vits # To start a server\n```\n\nYou can then access the TTS server [here](http://[::1]:5002/).\nMore details about the Docker images (like GPU support) can be found [here](https://tts.readthedocs.io/en/latest/docker_images.html).\n\n\n## Synthesizing speech by 🐸TTS\n\n### 🐍 Python API\n\n#### Running a multi-speaker and multi-lingual model\n\n```python\nimport torch\nfrom TTS.api import TTS\n\n# Get device\ndevice = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n\n# List available 
🐸TTS models\nprint(TTS().list_models())\n\n# Init TTS\ntts = TTS(\"tts_models/multilingual/multi-dataset/xtts_v2\").to(device)\n\n# Run TTS\n# ❗ Since this model is multi-lingual voice cloning model, we must set the target speaker_wav and language\n# Text to speech list of amplitude values as output\nwav = tts.tts(text=\"Hello world!\", speaker_wav=\"my/cloning/audio.wav\", language=\"en\")\n# Text to speech to a file\ntts.tts_to_file(text=\"Hello world!\", speaker_wav=\"my/cloning/audio.wav\", language=\"en\", file_path=\"output.wav\")\n```\n\n#### Running a single speaker model\n\n```python\n# Init TTS with the target model name\ntts = TTS(model_name=\"tts_models/de/thorsten/tacotron2-DDC\", progress_bar=False).to(device)\n\n# Run TTS\ntts.tts_to_file(text=\"Ich bin eine Testnachricht.\", file_path=OUTPUT_PATH)\n\n# Example voice cloning with YourTTS in English, French and Portuguese\ntts = TTS(model_name=\"tts_models/multilingual/multi-dataset/your_tts\", progress_bar=False).to(device)\ntts.tts_to_file(\"This is voice cloning.\", speaker_wav=\"my/cloning/audio.wav\", language=\"en\", file_path=\"output.wav\")\ntts.tts_to_file(\"C'est le clonage de la voix.\", speaker_wav=\"my/cloning/audio.wav\", language=\"fr-fr\", file_path=\"output.wav\")\ntts.tts_to_file(\"Isso é clonagem de voz.\", speaker_wav=\"my/cloning/audio.wav\", language=\"pt-br\", file_path=\"output.wav\")\n```\n\n#### Example voice conversion\n\nConverting the voice in `source_wav` to the voice of `target_wav`\n\n```python\ntts = TTS(model_name=\"voice_conversion_models/multilingual/vctk/freevc24\", progress_bar=False).to(\"cuda\")\ntts.voice_conversion_to_file(source_wav=\"my/source.wav\", target_wav=\"my/target.wav\", file_path=\"output.wav\")\n```\n\n#### Example voice cloning together with the voice conversion model.\nThis way, you can clone voices by using any model in 🐸TTS.\n\n```python\n\ntts = TTS(\"tts_models/de/thorsten/tacotron2-DDC\")\ntts.tts_with_vc_to_file(\n    \"Wie sage ich auf 
Italienisch, dass ich dich liebe?\",\n    speaker_wav=\"target/speaker.wav\",\n    file_path=\"output.wav\"\n)\n```\n\n#### Example text to speech using **Fairseq models in ~1100 languages** 🤯.\nFor Fairseq models, use the following name format: `tts_models/\u003clang-iso_code\u003e/fairseq/vits`.\nYou can find the language ISO codes [here](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)\nand learn about the Fairseq models [here](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).\n\n```python\n# TTS with on-the-fly voice conversion\napi = TTS(\"tts_models/deu/fairseq/vits\")\napi.tts_with_vc_to_file(\n    \"Wie sage ich auf Italienisch, dass ich dich liebe?\",\n    speaker_wav=\"target/speaker.wav\",\n    file_path=\"output.wav\"\n)\n```\n\n### Command-line `tts`\n\n\u003c!-- begin-tts-readme --\u003e\n\nSynthesize speech on the command line.\n\nYou can either use your trained model or choose a model from the provided list.\n\nIf you don't specify any models, `tts` uses the LJSpeech-based English model.\n\n#### Single Speaker Models\n\n- List provided models:\n\n  ```\n  $ tts --list_models\n  ```\n\n- Get model info (for both tts_models and vocoder_models):\n\n  - Query by type/name:\n    `--model_info_by_name` takes the model name as it appears in the `--list_models` output.\n    ```\n    $ tts --model_info_by_name \"\u003cmodel_type\u003e/\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\"\n    ```\n    For example:\n    ```\n    $ tts --model_info_by_name tts_models/tr/common-voice/glow-tts\n    $ tts --model_info_by_name vocoder_models/en/ljspeech/hifigan_v2\n    ```\n  - Query by type/idx:\n    `model_query_idx` is the corresponding index from the `--list_models` output.\n\n    ```\n    $ tts --model_info_by_idx \"\u003cmodel_type\u003e/\u003cmodel_query_idx\u003e\"\n    ```\n\n    For example:\n\n    ```\n    $ tts --model_info_by_idx tts_models/3\n    ```\n\n  - Query model info by full name:\n    ```\n    $ tts --model_info_by_name 
\"\u003cmodel_type\u003e/\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\"\n    ```\n\n- Run TTS with default models:\n\n  ```\n  $ tts --text \"Text for TTS\" --out_path output/path/speech.wav\n  ```\n\n- Run TTS and pipe out the generated TTS wav file data:\n\n  ```\n  $ tts --text \"Text for TTS\" --pipe_out --out_path output/path/speech.wav | aplay\n  ```\n\n- Run a TTS model with its default vocoder model:\n\n  ```\n  $ tts --text \"Text for TTS\" --model_name \"\u003cmodel_type\u003e/\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\" --out_path output/path/speech.wav\n  ```\n\n  For example:\n\n  ```\n  $ tts --text \"Text for TTS\" --model_name \"tts_models/en/ljspeech/glow-tts\" --out_path output/path/speech.wav\n  ```\n\n- Run with specific TTS and vocoder models from the list:\n\n  ```\n  $ tts --text \"Text for TTS\" --model_name \"\u003cmodel_type\u003e/\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\" --vocoder_name \"\u003cmodel_type\u003e/\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\" --out_path output/path/speech.wav\n  ```\n\n  For example:\n\n  ```\n  $ tts --text \"Text for TTS\" --model_name \"tts_models/en/ljspeech/glow-tts\" --vocoder_name \"vocoder_models/en/ljspeech/univnet\" --out_path output/path/speech.wav\n  ```\n\n- Run your own TTS model (Using Griffin-Lim Vocoder):\n\n  ```\n  $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav\n  ```\n\n- Run your own TTS and Vocoder models:\n\n  ```\n  $ tts --text \"Text for TTS\" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav\n      --vocoder_path path/to/vocoder.pth --vocoder_config_path path/to/vocoder_config.json\n  ```\n\n#### Multi-speaker Models\n\n- List the available speakers and choose a \u003cspeaker_id\u003e among them:\n\n  ```\n  $ tts --model_name 
\"\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\"  --list_speaker_idxs\n  ```\n\n- Run the multi-speaker TTS model with the target speaker ID:\n\n  ```\n  $ tts --text \"Text for TTS.\" --out_path output/path/speech.wav --model_name \"\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\"  --speaker_idx \u003cspeaker_id\u003e\n  ```\n\n- Run your own multi-speaker TTS model:\n\n  ```\n  $ tts --text \"Text for TTS\" --out_path output/path/speech.wav --model_path path/to/model.pth --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx \u003cspeaker_id\u003e\n  ```\n\n### Voice Conversion Models\n\n```\n$ tts --out_path output/path/speech.wav --model_name \"\u003clanguage\u003e/\u003cdataset\u003e/\u003cmodel_name\u003e\" --source_wav \u003cpath/to/speaker/wav\u003e --target_wav \u003cpath/to/reference/wav\u003e\n```\n\n\u003c!-- end-tts-readme --\u003e\n\n## Directory Structure\n```\n|- notebooks/       (Jupyter Notebooks for model evaluation, parameter selection and data analysis.)\n|- utils/           (common utilities.)\n|- TTS\n    |- bin/             (folder for all the executables.)\n      |- train*.py                  (train your target model.)\n      |- ...\n    |- tts/             (text to speech models)\n        |- layers/          (model layer definitions)\n        |- models/          (model definitions)\n        |- utils/           (model specific utilities.)\n    |- speaker_encoder/ (Speaker Encoder models.)\n        |- (same)\n    |- vocoder/         (Vocoder models.)\n        |- (same)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoqui-ai%2FTTS","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoqui-ai%2FTTS","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoqui-ai%2FTTS/lists"}