{"id":13460273,"url":"https://github.com/rhasspy/piper","last_synced_at":"2025-12-12T02:30:46.593Z","repository":{"id":65247052,"uuid":"587499842","full_name":"rhasspy/piper","owner":"rhasspy","description":"A fast, local neural text to speech system","archived":false,"fork":false,"pushed_at":"2025-03-03T15:37:38.000Z","size":218406,"stargazers_count":8826,"open_issues_count":408,"forks_count":681,"subscribers_count":89,"default_branch":"master","last_synced_at":"2025-05-06T14:56:03.797Z","etag":null,"topics":["speech-synthesis","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"https://rhasspy.github.io/piper-samples/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rhasspy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-10T22:25:40.000Z","updated_at":"2025-05-06T13:39:46.000Z","dependencies_parsed_at":"2023-10-14T16:16:59.705Z","dependency_job_id":"ab24c212-a8f4-47b3-9c8e-fdfba48d8937","html_url":"https://github.com/rhasspy/piper","commit_stats":{"total_commits":312,"total_committers":23,"mean_commits":"13.565217391304348","dds":"0.40705128205128205","last_synced_commit":"c0670df63daf07070c9be36b5c4bed270ad72383"},"previous_names":["rhasspy/larynx2"],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhasspy%2Fpiper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhasspy%2Fpiper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhasspy%2Fpiper/releases","
manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhasspy%2Fpiper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rhasspy","download_url":"https://codeload.github.com/rhasspy/piper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253969226,"owners_count":21992262,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["speech-synthesis","text-to-speech","tts"],"created_at":"2024-07-31T10:00:38.467Z","updated_at":"2025-12-12T02:30:46.551Z","avatar_url":"https://github.com/rhasspy.png","language":"C++","readme":"![Piper logo](etc/logo.png)\n\nA fast, local neural text to speech system that sounds great and is optimized for the Raspberry Pi 4.\nPiper is used in a [variety of projects](#people-using-piper).\n\n``` sh\necho 'Welcome to the world of speech synthesis!' 
| \\\n  ./piper --model en_US-lessac-medium.onnx --output_file welcome.wav\n```\n\n[Listen to voice samples](https://rhasspy.github.io/piper-samples) and check out a [video tutorial by Thorsten Müller](https://youtu.be/rjq5eZoWWSo)\n\nVoices are trained with [VITS](https://github.com/jaywalnut310/vits/) and exported to the [onnxruntime](https://onnxruntime.ai/).\n\n[![A library from the Open Home Foundation](https://www.openhomefoundation.org/badges/ohf-library.png)](https://www.openhomefoundation.org/)\n\n## Voices\n\nOur goal is to support Home Assistant and the [Year of Voice](https://www.home-assistant.io/blog/2022/12/20/year-of-voice/).\n\n[Download voices](VOICES.md) for the supported languages:\n\n* Arabic (ar_JO)\n* Catalan (ca_ES)\n* Czech (cs_CZ)\n* Welsh (cy_GB)\n* Danish (da_DK)\n* German (de_DE)\n* Greek (el_GR)\n* English (en_GB, en_US)\n* Spanish (es_ES, es_MX)\n* Finnish (fi_FI)\n* French (fr_FR)\n* Hungarian (hu_HU)\n* Icelandic (is_IS)\n* Italian (it_IT)\n* Georgian (ka_GE)\n* Kazakh (kk_KZ)\n* Luxembourgish (lb_LU)\n* Nepali (ne_NP)\n* Dutch (nl_BE, nl_NL)\n* Norwegian (no_NO)\n* Polish (pl_PL)\n* Portuguese (pt_BR, pt_PT)\n* Romanian (ro_RO)\n* Russian (ru_RU)\n* Serbian (sr_RS)\n* Swedish (sv_SE)\n* Swahili (sw_CD)\n* Turkish (tr_TR)\n* Ukrainian (uk_UA)\n* Vietnamese (vi_VN)\n* Chinese (zh_CN)\n\nYou will need two files per voice:\n\n1. A `.onnx` model file, such as [`en_US-lessac-medium.onnx`](https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx)\n2. A `.onnx.json` config file, such as [`en_US-lessac-medium.onnx.json`](https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json)\n\nThe `MODEL_CARD` file for each voice contains important licensing information. Piper is intended for text to speech research, and does not impose any additional restrictions on voice models. 
Some voices may have restrictive licenses, however, so please review them carefully!\n\n\n## Installation\n\nYou can [run Piper with Python](#running-in-python) or download a binary release:\n\n* [amd64](https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_amd64.tar.gz) (64-bit desktop Linux)\n* [arm64](https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_arm64.tar.gz) (64-bit Raspberry Pi 4)\n* [armv7](https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_armv7.tar.gz) (32-bit Raspberry Pi 3/4)\n\nIf you want to build from source, see the [Makefile](Makefile) and [C++ source](src/cpp).\nYou must download and extract [piper-phonemize](https://github.com/rhasspy/piper-phonemize) to `lib/Linux-$(uname -m)/piper_phonemize` before building.\nFor example, `lib/Linux-x86_64/piper_phonemize/lib/libpiper_phonemize.so` should exist for AMD/Intel machines (as well as everything else from `libpiper_phonemize-amd64.tar.gz`).\n\n\n## Usage\n\n1. [Download a voice](#voices) and extract the `.onnx` and `.onnx.json` files\n2. Run the `piper` binary with text on standard input, `--model /path/to/your-voice.onnx`, and `--output_file output.wav`\n\nFor example:\n\n``` sh\necho 'Welcome to the world of speech synthesis!' | \\\n  ./piper --model en_US-lessac-medium.onnx --output_file welcome.wav\n```\n\nFor multi-speaker models, use `--speaker \u003cnumber\u003e` to change speakers (default: 0).\n\nSee `piper --help` for more options.\n\n### Streaming Audio\n\nPiper can stream raw audio to stdout as it's produced:\n\n``` sh\necho 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' 
| \\\n  ./piper --model en_US-lessac-medium.onnx --output-raw | \\\n  aplay -r 22050 -f S16_LE -t raw -\n```\n\nThis is **raw** audio and not a WAV file, so make sure your audio player is set to play 16-bit mono PCM samples at the correct sample rate for the voice.\n\n### JSON Input\n\nThe `piper` executable can accept JSON input when using the `--json-input` flag. Each line of input must be a JSON object with a `text` field. For example:\n\n``` json\n{ \"text\": \"First sentence to speak.\" }\n{ \"text\": \"Second sentence to speak.\" }\n```\n\nOptional fields include:\n\n* `speaker` - string\n    * Name of the speaker to use from `speaker_id_map` in config (multi-speaker voices only)\n* `speaker_id` - number\n    * Id of speaker to use from 0 to number of speakers - 1 (multi-speaker voices only, overrides \"speaker\")\n* `output_file` - string\n    * Path to output WAV file\n    \nThe following example writes two sentences with different speakers to different files:\n\n``` json\n{ \"text\": \"First speaker.\", \"speaker_id\": 0, \"output_file\": \"/tmp/speaker_0.wav\" }\n{ \"text\": \"Second speaker.\", \"speaker_id\": 1, \"output_file\": \"/tmp/speaker_1.wav\" }\n```\n\n\n## People using Piper\n\nPiper has been used in the following projects/papers:\n\n* [Home Assistant](https://github.com/home-assistant/addons/blob/master/piper/README.md)\n* [Rhasspy 3](https://github.com/rhasspy/rhasspy3/)\n* [NVDA - NonVisual Desktop Access](https://www.nvaccess.org/post/in-process-8th-may-2023/#voices)\n* [Image Captioning for the Visually Impaired and Blind: A Recipe for Low-Resource Languages](https://www.techrxiv.org/articles/preprint/Image_Captioning_for_the_Visually_Impaired_and_Blind_A_Recipe_for_Low-Resource_Languages/22133894)\n* [Open Voice Operating System](https://github.com/OpenVoiceOS/ovos-tts-plugin-piper)\n* [JetsonGPT](https://github.com/shahizat/jetsonGPT)\n* [LocalAI](https://github.com/go-skynet/LocalAI)\n* [Lernstick EDU / EXAM: reading clipboard content 
aloud with language detection](https://lernstick.ch/)\n* [Natural Speech - A plugin for Runelite, an OSRS Client](https://github.com/phyce/rl-natural-speech)\n* [mintPiper](https://github.com/evuraan/mintPiper)\n* [Vim-Piper](https://github.com/wolandark/vim-piper)\n\n## Training\n\nSee the [training guide](TRAINING.md) and the [source code](src/python).\n\nPretrained checkpoints are available on [Hugging Face](https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main)\n\n\n## Running in Python\n\nSee [src/python_run](src/python_run)\n\nInstall with `pip`:\n\n``` sh\npip install piper-tts\n```\n\nand then run:\n\n``` sh\necho 'Welcome to the world of speech synthesis!' | piper \\\n  --model en_US-lessac-medium \\\n  --output_file welcome.wav\n```\n\nThis will automatically download [voice files](https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0) the first time they're used. Use `--data-dir` and `--download-dir` to adjust where voices are found/downloaded.\n\nIf you'd like to use a GPU, install the `onnxruntime-gpu` package:\n\n\n``` sh\n.venv/bin/pip3 install onnxruntime-gpu\n```\n\nand then run `piper` with the `--cuda` argument. 
You will need to have a functioning CUDA environment, such as what's available in [NVIDIA's PyTorch containers](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch).\n","funding_links":[],"categories":["Financing projects/organizations","C++","Text to Speech","\u003ca name=\"cpp\"\u003e\u003c/a\u003eC++","语言资源库","语音合成","Platforms","Advanced Topics","TTS (Text-to-Speech) | 文本转语音","Other","Voice \u0026 Multimodal (local) (16)","Open Source TTS Libraries"],"sub_categories":["Linux Software","Imgur","python","网络服务_其他","Voice \u0026 Multimodal","Open Source TTS Models | 开源 TTS 模型","Other","Python Libraries"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhasspy%2Fpiper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frhasspy%2Fpiper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhasspy%2Fpiper/lists"}