https://github.com/mgonzs13/audio_common

A PortAudio based audio_common with text to speech for ROS 2
https://github.com/mgonzs13/audio_common
audio espeak pyaudio ros2 text-to-speech tts
Last synced: 9 months ago
JSON representation
A PortAudio based audio_common with text to speech for ROS 2
Host: GitHub
URL: https://github.com/mgonzs13/audio_common
Owner: mgonzs13
License: mit
Created: 2021-11-19T23:51:45.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2025-06-28T16:05:40.000Z (12 months ago)
Last Synced: 2025-09-05T05:45:36.282Z (10 months ago)
Topics: audio, espeak, pyaudio, ros2, text-to-speech, tts
Language: C++
Homepage:
Size: 2.36 MB
Stars: 21
Watchers: 2
Forks: 14
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project

README

          # audio_capture

This repositiory provides a set of ROS 2 packages for audio. It provides a C++ version to capture and play audio data using PortAudio.



[![License: MIT](https://img.shields.io/badge/GitHub-MIT-informational)](https://opensource.org/license/mit) [![GitHub release](https://img.shields.io/github/release/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/releases) [![Code Size](https://img.shields.io/github/languages/code-size/mgonzs13/audio_common.svg?branch=main)](https://github.com/mgonzs13/audio_common?branch=main) [![Last Commit](https://img.shields.io/github/last-commit/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/commits/main) [![GitHub issues](https://img.shields.io/github/issues/mgonzs13/audio_common)](https://github.com/mgonzs13/audio_common/issues) [![GitHub pull requests](https://img.shields.io/github/issues-pr/mgonzs13/audio_common)](https://github.com/mgonzs13/audio_common/pulls) [![Contributors](https://img.shields.io/github/contributors/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/graphs/contributors) [![C++ Formatter Check](https://github.com/mgonzs13/audio_common/actions/workflows/cpp-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/cpp-formatter.yml?branch=main)

| ROS 2 Distro |                            Branch                            |                                                                                                           Build status                                                                                                            |                                                                  Docker Image                                                                   | Documentation                                                                                                                                                        |

| :----------: | :----------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------: | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |

|   **Foxy**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |       [![Foxy Build](https://github.com/mgonzs13/audio_common/actions/workflows/foxy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/foxy-docker-build.yml?branch=main)       |     [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-foxy-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=foxy)     | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

| **Galactic** | [`main`](https://github.com/mgonzs13/audio_common/tree/main) | [![Galactic Build](https://github.com/mgonzs13/audio_common/actions/workflows/galactic-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/galactic-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-galactic-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=galactic) | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

|  **Humble**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |    [![Humble Build](https://github.com/mgonzs13/audio_common/actions/workflows/humble-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/humble-docker-build.yml?branch=main)    |   [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-humble-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=humble)   | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

|   **Iron**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |       [![Iron Build](https://github.com/mgonzs13/audio_common/actions/workflows/iron-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/iron-docker-build.yml?branch=main)       |     [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-iron-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=iron)     | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

|  **Jazzy**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |     [![Jazzy Build](https://github.com/mgonzs13/audio_common/actions/workflows/jazzy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/jazzy-docker-build.yml?branch=main)      |    [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-jazzy-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=jazzy)    | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

|  **Kilted**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |    [![Kilted Build](https://github.com/mgonzs13/audio_common/actions/workflows/kilted-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/kilted-docker-build.yml?branch=main)    |   [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-kilted-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=kilted)   | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |

| **Rolling**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |  [![Rolling Build](https://github.com/mgonzs13/audio_common/actions/workflows/rolling-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/rolling-docker-build.yml?branch=main)   |  [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-rolling-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=rolling)  | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |



## Table of Contents

1. [Installation](#installation)

2. [Docker](#docker)

3. [Nodes](#nodes)

4. [Demos](#demos)

## Installation

```shell

cd ~/ros2_ws/src

git clone https://github.com/mgonzs13/audio_common.git

cd ~/ros2_ws

rosdep install --from-paths src --ignore-src -r -y

colcon build

```

## Docker

You can create a docker image to test audio_common. Use the following command inside the directory of audio_common.

```shell

docker build -t audio_common .

```

After the image is created, run a docker container with the following command.

```shell

docker run -it --rm --device /dev/snd audio_common

```

## Nodes

### audio_capturer_node

Node to obtain audio data from a microphone and publish it into the `audio` topic.

Click to expand

#### Parameters

- **format**: Specifies the audio format to be used for capturing. Possible values are:

  - `1` (paFloat32 - 32-bit floating point)

  - `2` (paInt32 - 32-bit integer)

  - `8` (paInt16 - 16-bit integer)

  - `16` (paInt8 - 8-bit integer)

  - `32` (paUInt8 - 8-bit unsigned integer)

  Default: `8` (paInt16)

  The integer values correspond to PortAudio sample format flags.

- **channels**: The number of audio channels to capture. Typically, `1` for mono and `2` for stereo. Default: `1`

- **rate**: The sample rate that is how many samples per second should be captured. Default: `16000`

- **chunk**: The size of each audio frame. Default: `512`

- **device**: The ID of the audio input device. A value of `-1` indicates that the default audio input device should be used. Default: `-1`

- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic to publish the audio data captured from the microphone. Type: `audio_common_msgs/msg/AudioStamped`

### audio_player_node

Node to play the audio data obtained from the `audio` topic.

Click to expand

#### Parameters

- **channels**: The number of audio channels to play. Typically, `1` for mono and `2` for stereo. Default: `2`

  - The node automatically handles conversion between mono and stereo formats if needed.

- **device**: The ID of the audio output device. A value of `-1` indicates that the default audio output device should be used. Default: `-1`

#### ROS 2 Interfaces

- **audio**: Topic subscriber to get the audio data to be played. Type: `audio_common_msgs/msg/AudioStamped`

### music_node

Node to play music from audio files in `wav` format.

Click to expand

#### Parameters

- **chunk**: The size of each audio frame. Default: `2048`

- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic to publish the audio data from the files. Type: `audio_common_msgs/msg/AudioStamped`

- **music_play**: Service to play audio files. Type: `audio_common_msgs/srv/MusicPlay`

  - Parameters:

    - `audio`: Name of a built-in audio sample (e.g., "elevator")

    - `file_path`: Path to a custom WAV file (ignored if audio is specified)

    - `loop`: Boolean to indicate if the audio should loop. Default: `false`

- **music_stop**: Service to stop the currently playing music. Type: `std_srvs/srv/Trigger`

- **music_pause**: Service to pause the currently playing music. Type: `std_srvs/srv/Trigger`

- **music_resume**: Service to resume paused music. Type: `std_srvs/srv/Trigger`

### tts_node

Node to generate audio from text (TTS) using espeak.

Click to expand

#### Parameters

- **chunk**: The size of each audio frame. Default: `4096`

- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `""`

#### ROS 2 Interfaces

- **audio**: Topic publisher to send the audio data generated by the TTS. Type: `audio_common_msgs/msg/AudioStamped`

- **say**: Action to generate audio data from a text. Type: `audio_common_msgs/action/TTS`

  - Goal:

    - `text`: The text to convert to speech

    - `language`: The language to use for speech synthesis. Default: `"en"`

    - `volume`: The volume of the generated speech (0.0-1.0). Default: `1.0`

    - `rate`: The speech rate (1.0 is normal speed). Default: `1.0`

  - Feedback:

    - `audio`: The audio being currently played

  - Result:

    - `text`: The text that was converted to speech

## Demos

### Audio Capturer/Player

```shell

ros2 run audio_common audio_capturer_node

```

```shell

ros2 run audio_common audio_player_node

```

### TTS

```shell

ros2 run audio_common tts_node

```

```shell

ros2 run audio_common audio_player_node

```

```shell

ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World'}"

```

Advanced TTS example with additional parameters:

```shell

ros2 action send_goal /say audio_common_msgs/action/TTS "{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}"

```

### Music Player

```shell

ros2 run audio_common music_node

```

```shell

ros2 run audio_common audio_player_node

```

Play a built-in sample:

```shell

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator'}"

```

Play a custom WAV file:

```shell

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{file_path: '/path/to/your/file.wav'}"

```

Play with looping enabled:

```shell

ros2 service call /music_play audio_common_msgs/srv/MusicPlay "{audio: 'elevator', loop: true}"

```

Control playback:

```shell

ros2 service call /music_pause std_srvs/srv/Trigger "{}"

ros2 service call /music_resume std_srvs/srv/Trigger "{}"

ros2 service call /music_stop std_srvs/srv/Trigger "{}"

```
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mgonzs13/audio_common

Awesome Lists containing this project

README