{"id":16277575,"url":"https://github.com/mgonzs13/audio_common","last_synced_at":"2025-09-17T20:31:35.917Z","repository":{"id":172766508,"uuid":"429964919","full_name":"mgonzs13/audio_common","owner":"mgonzs13","description":"A PortAudio based audio_common with text to speech for ROS 2","archived":false,"fork":false,"pushed_at":"2025-06-28T16:05:40.000Z","size":2472,"stargazers_count":21,"open_issues_count":0,"forks_count":14,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-05T05:45:36.282Z","etag":null,"topics":["audio","espeak","pyaudio","ros2","text-to-speech","tts"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mgonzs13.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-11-19T23:51:45.000Z","updated_at":"2025-09-02T03:44:21.000Z","dependencies_parsed_at":"2024-01-04T19:25:36.435Z","dependency_job_id":"b8048e2e-29a6-4636-8633-4de225426e63","html_url":"https://github.com/mgonzs13/audio_common","commit_stats":null,"previous_names":["mgonzs13/audio_common"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/mgonzs13/audio_common","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Faudio_common","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Faudio_common/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fau
dio_common/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Faudio_common/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mgonzs13","download_url":"https://codeload.github.com/mgonzs13/audio_common/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Faudio_common/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275658682,"owners_count":25504776,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-17T02:00:09.119Z","response_time":84,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","espeak","pyaudio","ros2","text-to-speech","tts"],"created_at":"2024-10-10T18:55:27.682Z","updated_at":"2025-09-17T20:31:35.887Z","avatar_url":"https://github.com/mgonzs13.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# audio_common\n\nThis repository provides a set of ROS 2 packages for audio. 
It provides a C++ version to capture and play audio data using PortAudio.\n\n\u003cdiv align=\"center\"\u003e\n\n[![License: MIT](https://img.shields.io/badge/GitHub-MIT-informational)](https://opensource.org/license/mit) [![GitHub release](https://img.shields.io/github/release/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/releases) [![Code Size](https://img.shields.io/github/languages/code-size/mgonzs13/audio_common.svg?branch=main)](https://github.com/mgonzs13/audio_common?branch=main) [![Last Commit](https://img.shields.io/github/last-commit/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/commits/main) [![GitHub issues](https://img.shields.io/github/issues/mgonzs13/audio_common)](https://github.com/mgonzs13/audio_common/issues) [![GitHub pull requests](https://img.shields.io/github/issues-pr/mgonzs13/audio_common)](https://github.com/mgonzs13/audio_common/pulls) [![Contributors](https://img.shields.io/github/contributors/mgonzs13/audio_common.svg)](https://github.com/mgonzs13/audio_common/graphs/contributors) [![C++ Formatter Check](https://github.com/mgonzs13/audio_common/actions/workflows/cpp-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/cpp-formatter.yml?branch=main)\n\n| ROS 2 Distro |                            Branch                            |                                                                                                           Build status                                                                                                            |                                                                  Docker Image                                                                   | Documentation                                                                                                                                                        |\n| :----------: | :----------------------------------------------------------: | 
:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------------------: | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n|   **Foxy**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |       [![Foxy Build](https://github.com/mgonzs13/audio_common/actions/workflows/foxy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/foxy-docker-build.yml?branch=main)       |     [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-foxy-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=foxy)     | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n| **Galactic** | [`main`](https://github.com/mgonzs13/audio_common/tree/main) | [![Galactic Build](https://github.com/mgonzs13/audio_common/actions/workflows/galactic-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/galactic-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-galactic-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=galactic) | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n|  **Humble**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |    [![Humble 
Build](https://github.com/mgonzs13/audio_common/actions/workflows/humble-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/humble-docker-build.yml?branch=main)    |   [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-humble-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=humble)   | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n|   **Iron**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |       [![Iron Build](https://github.com/mgonzs13/audio_common/actions/workflows/iron-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/iron-docker-build.yml?branch=main)       |     [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-iron-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=iron)     | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n|  **Jazzy**   | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |     [![Jazzy Build](https://github.com/mgonzs13/audio_common/actions/workflows/jazzy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/jazzy-docker-build.yml?branch=main)      |    [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-jazzy-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=jazzy)    | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n|  **Kilted**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |    [![Kilted 
Build](https://github.com/mgonzs13/audio_common/actions/workflows/kilted-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/kilted-docker-build.yml?branch=main)    |   [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-kilted-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=kilted)   | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n| **Rolling**  | [`main`](https://github.com/mgonzs13/audio_common/tree/main) |  [![Rolling Build](https://github.com/mgonzs13/audio_common/actions/workflows/rolling-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/audio_common/actions/workflows/rolling-docker-build.yml?branch=main)   |  [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-rolling-blue)](https://hub.docker.com/r/mgons/audio_common/tags?name=rolling)  | [![Doxygen Deployment](https://github.com/mgonzs13/audio_common/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/audio_common/latest) |\n\n\u003c/div\u003e\n\n## Table of Contents\n\n1. [Installation](#installation)\n2. [Docker](#docker)\n3. [Nodes](#nodes)\n4. [Demos](#demos)\n\n## Installation\n\n```shell\ncd ~/ros2_ws/src\ngit clone https://github.com/mgonzs13/audio_common.git\ncd ~/ros2_ws\nrosdep install --from-paths src --ignore-src -r -y\ncolcon build\n```\n\n## Docker\n\nYou can create a docker image to test audio_common. 
Use the following command inside the directory of audio_common.\n\n```shell\ndocker build -t audio_common .\n```\n\nAfter the image is created, run a docker container with the following command.\n\n```shell\ndocker run -it --rm --device /dev/snd audio_common\n```\n\n## Nodes\n\n### audio_capturer_node\n\nNode to obtain audio data from a microphone and publish it to the `audio` topic.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand\u003c/summary\u003e\n\n#### Parameters\n\n- **format**: Specifies the audio format to be used for capturing. Possible values are:\n\n  - `1` (paFloat32 - 32-bit floating point)\n  - `2` (paInt32 - 32-bit integer)\n  - `8` (paInt16 - 16-bit integer)\n  - `16` (paInt8 - 8-bit integer)\n  - `32` (paUInt8 - 8-bit unsigned integer)\n\n  Default: `8` (paInt16)\n\n  The integer values correspond to PortAudio sample format flags.\n\n- **channels**: The number of audio channels to capture. Typically, `1` for mono and `2` for stereo. Default: `1`\n\n- **rate**: The sample rate, i.e., the number of samples captured per second. Default: `16000`\n\n- **chunk**: The size of each audio frame. Default: `512`\n\n- **device**: The ID of the audio input device. A value of `-1` indicates that the default audio input device should be used. Default: `-1`\n\n- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `\"\"`\n\n#### ROS 2 Interfaces\n\n- **audio**: Topic to publish the audio data captured from the microphone. Type: `audio_common_msgs/msg/AudioStamped`\n\n\u003c/details\u003e\n\n### audio_player_node\n\nNode to play the audio data obtained from the `audio` topic.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand\u003c/summary\u003e\n\n#### Parameters\n\n- **channels**: The number of audio channels to play. Typically, `1` for mono and `2` for stereo. 
Default: `2`\n\n  - The node automatically handles conversion between mono and stereo formats if needed.\n\n- **device**: The ID of the audio output device. A value of `-1` indicates that the default audio output device should be used. Default: `-1`\n\n#### ROS 2 Interfaces\n\n- **audio**: Topic subscriber to get the audio data to be played. Type: `audio_common_msgs/msg/AudioStamped`\n\n\u003c/details\u003e\n\n### music_node\n\nNode to play music from audio files in `wav` format.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand\u003c/summary\u003e\n\n#### Parameters\n\n- **chunk**: The size of each audio frame. Default: `2048`\n\n- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. Default: `\"\"`\n\n#### ROS 2 Interfaces\n\n- **audio**: Topic to publish the audio data from the files. Type: `audio_common_msgs/msg/AudioStamped`\n\n- **music_play**: Service to play audio files. Type: `audio_common_msgs/srv/MusicPlay`\n\n  - Parameters:\n    - `audio`: Name of a built-in audio sample (e.g., \"elevator\")\n    - `file_path`: Path to a custom WAV file (ignored if audio is specified)\n    - `loop`: Boolean to indicate if the audio should loop. Default: `false`\n\n- **music_stop**: Service to stop the currently playing music. Type: `std_srvs/srv/Trigger`\n\n- **music_pause**: Service to pause the currently playing music. Type: `std_srvs/srv/Trigger`\n\n- **music_resume**: Service to resume paused music. Type: `std_srvs/srv/Trigger`\n\n\u003c/details\u003e\n\n### tts_node\n\nNode to generate audio from text (TTS) using espeak.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand\u003c/summary\u003e\n\n#### Parameters\n\n- **chunk**: The size of each audio frame. Default: `4096`\n\n- **frame_id**: An identifier for the audio frame. This can be useful for synchronizing audio data with other data streams. 
Default: `\"\"`\n\n#### ROS 2 Interfaces\n\n- **audio**: Topic publisher to send the audio data generated by the TTS. Type: `audio_common_msgs/msg/AudioStamped`\n\n- **say**: Action to generate audio data from a text. Type: `audio_common_msgs/action/TTS`\n  - Goal:\n    - `text`: The text to convert to speech\n    - `language`: The language to use for speech synthesis. Default: `\"en\"`\n    - `volume`: The volume of the generated speech (0.0-1.0). Default: `1.0`\n    - `rate`: The speech rate (1.0 is normal speed). Default: `1.0`\n  - Feedback:\n    - `audio`: The audio being currently played\n  - Result:\n    - `text`: The text that was converted to speech\n\n\u003c/details\u003e\n\n## Demos\n\n### Audio Capturer/Player\n\n```shell\nros2 run audio_common audio_capturer_node\n```\n\n```shell\nros2 run audio_common audio_player_node\n```\n\n### TTS\n\n```shell\nros2 run audio_common tts_node\n```\n\n```shell\nros2 run audio_common audio_player_node\n```\n\n```shell\nros2 action send_goal /say audio_common_msgs/action/TTS \"{'text': 'Hello World'}\"\n```\n\nAdvanced TTS example with additional parameters:\n\n```shell\nros2 action send_goal /say audio_common_msgs/action/TTS \"{'text': 'Hello World', 'language': 'en-us', 'volume': 0.8, 'rate': 1.2}\"\n```\n\n### Music Player\n\n```shell\nros2 run audio_common music_node\n```\n\n```shell\nros2 run audio_common audio_player_node\n```\n\nPlay a built-in sample:\n\n```shell\nros2 service call /music_play audio_common_msgs/srv/MusicPlay \"{audio: 'elevator'}\"\n```\n\nPlay a custom WAV file:\n\n```shell\nros2 service call /music_play audio_common_msgs/srv/MusicPlay \"{file_path: '/path/to/your/file.wav'}\"\n```\n\nPlay with looping enabled:\n\n```shell\nros2 service call /music_play audio_common_msgs/srv/MusicPlay \"{audio: 'elevator', loop: true}\"\n```\n\nControl playback:\n\n```shell\nros2 service call /music_pause std_srvs/srv/Trigger \"{}\"\nros2 service call /music_resume std_srvs/srv/Trigger \"{}\"\nros2 service 
call /music_stop std_srvs/srv/Trigger \"{}\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmgonzs13%2Faudio_common","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmgonzs13%2Faudio_common","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmgonzs13%2Faudio_common/lists"}