{"id":29195684,"url":"https://github.com/gptscript-ai/gptspeak","last_synced_at":"2026-02-15T23:05:15.741Z","repository":{"id":259141323,"uuid":"876403043","full_name":"gptscript-ai/gptspeak","owner":"gptscript-ai","description":"Text-to-speech CLI tool and Python library using OpenAI's TTS API","archived":false,"fork":false,"pushed_at":"2025-01-27T22:49:32.000Z","size":48,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-01-27T23:29:45.464Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gptscript-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-21T23:01:57.000Z","updated_at":"2025-01-27T22:49:37.000Z","dependencies_parsed_at":"2024-10-22T23:58:52.965Z","dependency_job_id":null,"html_url":"https://github.com/gptscript-ai/gptspeak","commit_stats":null,"previous_names":["gptscript-ai/gptspeak"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/gptscript-ai/gptspeak","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gptscript-ai%2Fgptspeak","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gptscript-ai%2Fgptspeak/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gptscript-ai%2Fgptspeak/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gptscript-ai%2Fgptspeak/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/owners/gptscript-ai","download_url":"https://codeload.github.com/gptscript-ai/gptspeak/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gptscript-ai%2Fgptspeak/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263077632,"owners_count":23410167,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-02T05:05:28.690Z","updated_at":"2026-02-15T23:05:10.694Z","avatar_url":"https://github.com/gptscript-ai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GPTSpeak\n\nGPTSpeak is a tool designed to convert text into high-quality audio files using OpenAI's Text-to-Speech (TTS) API. It can be used both as a command-line interface (CLI) tool and as a Python module. 
GPTSpeak provides functionality to play the generated audio files directly and concatenate multiple audio files, making it an ideal solution for developers, writers, educators, and content creators who need to transform written content into spoken audio efficiently.\n\nWith GPTSpeak, you can:\n\n- Convert text files or direct text input into natural-sounding audio.\n- Choose from multiple voices and TTS models to suit your needs.\n- Play generated audio files directly from the command line.\n- Concatenate multiple audio files into a single file.\n\nIt's as simple as:\n\n```bash\ngptspeak \"Hello, world!\"\n```\n\n## Features\n\n- **Dual-Use Design**: Use GPTSpeak as a CLI tool for quick operations or as a Python module for integration into larger projects.\n- **Convert Text to Speech**: Transform text files or direct text input into high-quality audio files using OpenAI's TTS API.\n- **Handle Long Texts**: Automatically split long texts into appropriate chunks and combine the resulting audio, allowing for conversion of texts of any length.\n- **Multiple Voice Options**: Choose from a variety of available voices to match the tone of your content.\n- **Model Selection**: Select different TTS models based on quality, speed, or other attributes.\n- **Audio Playback**: Play generated audio files directly from the command line or within your Python scripts.\n- **Audio Concatenation**: Efficiently combine multiple audio files into a single file.\n- **Cross-platform Compatibility**: Works on macOS, Linux, and Windows.\n- **Customizable Output**: Specify output file paths and formats.\n- **Error Handling and Logging**: Robust error detection with informative messages and logging for troubleshooting.\n\n## Table of Contents\n\n- [Installation](#installation)\n  - [Prerequisites](#prerequisites)\n- [Quick Start](#quick-start)\n- [Usage](#usage)\n  - [Setting Up Environment Variables](#setting-up-environment-variables)\n  - [Using GPTSpeak as a Python 
Package](#using-gptspeak-as-a-python-package)\n  - [Using GPTSpeak via the CLI](#using-gptspeak-via-the-cli)\n    - [CLI Options](#cli-options)\n- [Available Models and Voices](#available-models-and-voices)\n- [Examples](#examples)\n- [Contributing](#contributing)\n- [License](#license)\n\n## Installation\n\nInstall GPTSpeak using pip:\n\n```bash\npip install gptspeak\n```\n\n### Prerequisites\n\nEnsure you have the following installed:\n\n- Python 3.9 or higher\n- ffmpeg (for audio processing)\n\n#### Installing ffmpeg\n\nffmpeg is required for audio processing. You can check if you already have it installed by running `ffmpeg -version` in your terminal/command prompt.\n\n- **macOS (with Homebrew)**:\n\n  ```bash\n  brew install ffmpeg\n  ```\n\n- **Ubuntu/Debian**:\n\n  ```bash\n  sudo apt-get install ffmpeg\n  ```\n\n- **Windows**:\n  1. Download the latest ffmpeg build from the [official ffmpeg website](https://ffmpeg.org/download.html).\n  2. Extract the downloaded package and move the extracted directory to your desired location.\n  3. Add the `bin/` directory from the extracted folder to your system PATH.\n  4. 
Verify the installation by opening a new command prompt and running `ffmpeg -version`.\n\nAfter installing ffmpeg, you should be ready to use GPTSpeak.\n\n## Quick Start\n\nHere's how you can quickly get started with GPTSpeak:\n\n```bash\n# Set your API key\nexport OPENAI_API_KEY=\"your-openai-api-key\"\n\n# Convert direct text input to audio\ngptspeak \"Hello, world!\" -o hello.mp3\n\n# Play direct text input without saving to a file (turn your audio up) \ngptspeak \"Hello, world!\"\n\n# Convert a text file to audio\ngptspeak convert input.txt -o output.mp3\n\n# Play the generated audio file\ngptspeak play output.mp3\n\n# Concatenate multiple audio files\ngptspeak concat -o combined.mp3 file1.mp3 file2.mp3\n```\n\n## Usage\n\n### Setting Up Environment Variables\n\nBefore using GPTSpeak, set up the API key for OpenAI by setting the appropriate environment variable:\n\n```bash\nexport OPENAI_API_KEY=\"your-openai-api-key\"\n```\n\nYou can set this variable in your shell profile (`~/.bashrc`, `~/.zshrc`, etc.) 
or include it in your Python script before importing GPTSpeak.\n\n\u003e **Note**: Keep your API key secure and do not expose it in code repositories.\n\n### Using GPTSpeak as a Python Package\n\nHere's an example of how to use GPTSpeak in your Python code:\n\n```python\nimport os\nfrom pathlib import Path\nfrom gptspeak.core.converter import convert_text_to_speech, convert_text_to_speech_direct\nfrom gptspeak.core.player import play_audio\n\n# Set the OpenAI API key\nos.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\"\n\n# Convert text file to speech\ninput_file = Path(\"input.txt\")\noutput_file = Path(\"output.mp3\")\nmodel = \"tts-1\"\nvoice = \"alloy\"\n\nconvert_text_to_speech(input_file, output_file, model, voice)\nprint(f\"Audio file created: {output_file}\")\n\n# Convert direct text input to speech\ntext = \"Hello, world!\"\ndirect_output_file = Path(\"hello.mp3\")\n\nconvert_text_to_speech_direct(text, direct_output_file, model, voice)\nprint(f\"Audio file created: {direct_output_file}\")\n\n# Play the generated audio\nplay_audio(output_file)\n```\n\n### Configuring Default Settings\n\nGPTSpeak allows you to configure default settings for the TTS model and voice. 
You can do this interactively or by specifying the options directly:\n\n```bash\n# Interactive configuration\ngptspeak configure\n\n# Set default model and voice directly\ngptspeak configure -m tts-1-hd -v nova\n```\n\nThe configuration is saved in `~/.gptspeak.ini` and will be used for future conversions unless overridden by command-line options.\n\n### Using GPTSpeak via the CLI\n\nWhen using the command-line interface, ensure you've set the appropriate environment variable for the OpenAI API key.\n\nTo convert direct text input to audio:\n\n```bash\ngptspeak \"Hello, world!\" -o hello.mp3 -m tts-1 -v alloy\n```\n\nTo convert a text file to audio:\n\n```bash\ngptspeak convert input.txt -o output.mp3 -m tts-1 -v alloy\n```\n\nTo concatenate multiple audio files:\n\n```bash\ngptspeak concat -o combined.mp3 file1.mp3 file2.mp3 file3.mp3\n```\n\nTo play an audio file:\n\n```bash\ngptspeak play output.mp3\n```\n\n#### CLI Options\n\nFor direct text input and the `convert` command:\n\n- `-o`, `--output`: Output audio file path (default: speech.mp3)\n- `-m`, `--model`: TTS model to use (default: tts-1)\n- `-v`, `--voice`: Voice to use for speech (default: alloy)\n\nFor the `play` command:\n\n- No additional options; just provide the path to the audio file.\n\nFor the `concat` command:\n\n- `-o`, `--output`: Output audio file path (default: concatenated.mp3)\n\nFor the `configure` command:\n\n- `-m`, `--model`: Set the default TTS model\n- `-v`, `--voice`: Set the default voice\n\n## Available Models and Voices\n\nGPTSpeak supports the following models and voices from OpenAI's TTS API:\n\n### Models\n\n- `tts-1`: Standard TTS model\n- `tts-1-hd`: High-definition TTS model\n\n### Voices\n\n- `alloy`: Neutral voice\n- `echo`: Soft and gentle voice\n- `fable`: Expressive and dynamic voice\n- `onyx`: Deep and authoritative voice\n- `nova`: Warm and friendly voice\n- `shimmer`: Clear and energetic voice\n\nTo use a specific model and voice, specify them in the 
command:\n\n```bash\ngptspeak \"Hello, world!\" -o hello.mp3 -m tts-1-hd -v nova\n```\n\n## Examples\n\n### Converting Direct Text Input with Custom Voice\n\n```bash\ngptspeak \"Welcome to GPTSpeak!\" -o welcome.mp3 -v fable\n```\n\nThis command will convert the text \"Welcome to GPTSpeak!\" into an audio file named `welcome.mp3` using the \"fable\" voice.\n\n### Converting a Text File with Custom Voice\n\n```bash\ngptspeak convert story.txt -o narration.mp3 -v echo\n```\n\nThis command will convert the content of `story.txt` into an audio file named `narration.mp3` using the \"echo\" voice.\n\n### Converting a Long Text File\n\n```bash\ngptspeak convert long_story.txt -o long_narration.mp3 -v nova\n```\n\nThis command will convert the content of `long_story.txt` into an audio file named `long_narration.mp3` using the \"nova\" voice. If the text is longer than the API's character limit, GPTSpeak will automatically split it into chunks, process each chunk separately, and combine the results into a single audio file.\n\n### Concatenating Multiple Audio Files\n\n```bash\ngptspeak concat -o full_audiobook.mp3 chapter1.mp3 chapter2.mp3 chapter3.mp3\n```\n\nThis command will combine `chapter1.mp3`, `chapter2.mp3`, and `chapter3.mp3` into a single audio file named `full_audiobook.mp3`.\n\n### Playing an Audio File\n\n```bash\ngptspeak play narration.mp3\n```\n\nThis command will play the `narration.mp3` file.\n\n## Contributing\n\nContributions to GPTSpeak are welcome! If you'd like to contribute, please follow these steps:\n\n1. Fork the repository on GitHub.\n2. Create a new branch for your feature or bugfix.\n3. Make your changes and ensure tests pass.\n4. Submit a pull request with a clear description of your changes.\n\nPlease ensure that your code adheres to the existing style conventions and passes all tests.\n\n## License\n\nGPTSpeak is licensed under the Apache-2.0 License. 
See [LICENSE](LICENSE) for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgptscript-ai%2Fgptspeak","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgptscript-ai%2Fgptspeak","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgptscript-ai%2Fgptspeak/lists"}