https://github.com/mgonzs13/tts_ros
Text-to-Speech for ROS 2
- Host: GitHub
- URL: https://github.com/mgonzs13/tts_ros
- Owner: mgonzs13
- License: mit
- Created: 2024-03-18T12:38:52.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-22T11:19:19.000Z (12 months ago)
- Last Synced: 2024-05-22T11:28:35.142Z (12 months ago)
- Topics: ros2, text-to-speech, tts
- Language: Python
- Homepage:
- Size: 13.7 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# tts_ros
This repository integrates the Python [TTS](https://pypi.org/project/TTS/) (Text-to-Speech) package into ROS 2 using [audio_common](https://github.com/mgonzs13/audio_common) [4.0.4](https://github.com/mgonzs13/audio_common/releases/tag/4.0.4).
| ROS 2 Distro | Branch | Build status | Docker Image | Documentation |
| :----------: | :----: | :----------: | :----------: | :-----------: |
| **Humble** | [`main`](https://github.com/mgonzs13/tts_ros/tree/main) | [Build](https://github.com/mgonzs13/tts_ros/actions/workflows/humble-docker-build.yml?branch=main) | [Docker Hub](https://hub.docker.com/r/mgons/tts_ros/tags?name=humble) | [Docs](https://mgonzs13.github.io/tts_ros/) |
| **Iron** | [`main`](https://github.com/mgonzs13/tts_ros/tree/main) | [Build](https://github.com/mgonzs13/tts_ros/actions/workflows/iron-docker-build.yml?branch=main) | [Docker Hub](https://hub.docker.com/r/mgons/tts_ros/tags?name=iron) | [Docs](https://mgonzs13.github.io/tts_ros/) |
| **Jazzy** | [`main`](https://github.com/mgonzs13/tts_ros/tree/main) | [Build](https://github.com/mgonzs13/tts_ros/actions/workflows/jazzy-docker-build.yml?branch=main) | [Docker Hub](https://hub.docker.com/r/mgons/tts_ros/tags?name=jazzy) | [Docs](https://mgonzs13.github.io/tts_ros/) |
| **Rolling** | [`main`](https://github.com/mgonzs13/tts_ros/tree/main) | [Build](https://github.com/mgonzs13/tts_ros/actions/workflows/rolling-docker-build.yml?branch=main) | [Docker Hub](https://hub.docker.com/r/mgons/tts_ros/tags?name=rolling) | [Docs](https://mgonzs13.github.io/tts_ros/) |

## Table of Contents
1. [Installation](#installation)
2. [Docker](#docker)
3. [Usage](#usage)

## Installation
```shell
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
git clone https://github.com/mgonzs13/tts_ros.git
pip3 install -r tts_ros/requirements.txt
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build
```

## Docker
You can build the tts_ros Docker image:
```shell
docker build -t tts_ros .
```

Then, you can run the Docker container:
```shell
docker run -it --rm --device /dev/snd tts_ros
```

## Usage
To use this tool, run the `tts_node`. It has the following parameters:
- `chunk`: Size of audio chunks to be sent to the audio player.
- `frame_id`: Frame ID for the TTS output.
- `model`: The tts model. You can check the available models with `tts --list_models`.
- `model_path`: Path to a local model file.
- `config_path`: Path to a config file.
- `vocoder_path`: Path to a vocoder model file.
- `vocoder_config_path`: Path to a vocoder config file.
- `device`: The device to run the model on, named as in torch (for example `cpu` or `cuda`).
- `speaker_wav`: The WAV file used for voice cloning.
- `speaker`: Which speaker voice to use for multi-speaker models. Check the available speakers with `tts --model_name <model_name> --list_speaker_idxs`.
- `stream`: Whether to stream the audio data.

### Parameters Format
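Each parameter is passed to the node as a `-p name:=value` pair after `--ros-args`. As a rough illustration of that format, the Python sketch below assembles such a command from a dictionary. The helper `build_tts_node_command` and the example values are hypothetical and not part of tts_ros; the script only builds and prints the command, it does not run it.

```python
import shlex


def build_tts_node_command(params):
    """Return the argv list for running tts_node with the given parameters.

    Hypothetical helper for illustration; tts_ros does not ship it.
    """
    argv = ["ros2", "run", "tts_ros", "tts_node", "--ros-args"]
    for name, value in params.items():
        # Each parameter becomes its own "-p name:=value" pair.
        argv += ["-p", f"{name}:={value}"]
    return argv


# Example values only; replace them with your own frame, model and device.
params = {
    "chunk": 4096,
    "frame_id": "your-frame",
    "model": "your-model",
    "device": "cpu",
    "stream": False,
}

command = build_tts_node_command(params)
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems if a value, such as a `speaker_wav` path, contains spaces.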
```shell
ros2 run tts_ros tts_node --ros-args -p chunk:=4096 -p frame_id:="your-frame" -p model:="your-model" -p device:="cpu/cuda" -p speaker_wav:="/path/to/wav/file" -p stream:=False
```

## Demo
```shell
ros2 launch tts_bringup tts.launch.py device:="cuda"
```

```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"
```

## Streaming Demo
```shell
ros2 launch tts_bringup tts.launch.py stream:=True model:="tts_models/multilingual/multi-dataset/xtts_v2" speaker_wav:="/home/miguel/Downloads/question_1.wav" device:="cuda"
```

```shell
ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World, How are you? Can you hear me? What is your favorite color? Do you know the Robot Operating System?'}"
```
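
When the text to speak comes from a script, quoting the goal by hand gets error-prone, for example if the text contains an apostrophe. The sketch below builds the same `ros2 action send_goal` command used in the demos, encoding the goal with `json.dumps`. It assumes the goal argument is parsed as YAML, so a JSON object is accepted for this simple case. The helper `build_say_goal_command` is hypothetical and not part of tts_ros.

```python
import json
import shlex


def build_say_goal_command(text, action_name="/say"):
    """Return the argv list that sends `text` as a TTS goal.

    Hypothetical helper for illustration; tts_ros does not ship it.
    """
    # JSON output handles quotes inside the text; the goal is assumed
    # to be parsed as YAML by the ros2 CLI.
    goal = json.dumps({"text": text})
    return [
        "ros2", "action", "send_goal",
        action_name, "audio_common_msgs/action/TTS", goal,
    ]


command = build_say_goal_command("Hello World")
print(shlex.join(command))
```

On a machine with the ROS 2 workspace sourced and the TTS node running, the resulting list could be passed to `subprocess.run(command, check=True)`.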