{"id":16277572,"url":"https://github.com/mgonzs13/whisper_ros","last_synced_at":"2025-08-30T04:06:08.008Z","repository":{"id":180159044,"uuid":"635012996","full_name":"mgonzs13/whisper_ros","owner":"mgonzs13","description":"Speech-to-Text based on SileroVAD + whisper.cpp (GGML Whisper) for ROS 2","archived":false,"fork":false,"pushed_at":"2025-06-28T16:35:02.000Z","size":2026,"stargazers_count":78,"open_issues_count":0,"forks_count":19,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-06-28T17:32:33.638Z","etag":null,"topics":["asr","automatic-speech-recognition","ggml","ros2","speech-recognition","speech-to-text","vad","voice-activity-detection","whisper","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mgonzs13.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-05-01T19:12:26.000Z","updated_at":"2025-06-28T16:35:05.000Z","dependencies_parsed_at":null,"dependency_job_id":"bafb2db0-38d9-4d41-8988-8478703d5017","html_url":"https://github.com/mgonzs13/whisper_ros","commit_stats":null,"previous_names":["mgonzs13/whisper_ros"],"tags_count":48,"template":false,"template_full_name":null,"purl":"pkg:github/mgonzs13/whisper_ros","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fwhisper_ros","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fwhisper_ros/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fwhisper_ros/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fwhisper_ros/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mgonzs13","download_url":"https://codeload.github.com/mgonzs13/whisper_ros/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mgonzs13%2Fwhisper_ros/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272800967,"owners_count":24995185,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-30T02:00:09.474Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","automatic-speech-recognition","ggml","ros2","speech-recognition","speech-to-text","vad","voice-activity-detection","whisper","whisper-cpp"],"created_at":"2024-10-10T18:55:27.321Z","updated_at":"2025-08-30T04:06:08.002Z","avatar_url":"https://github.com/mgonzs13.png","language":"C++","funding_links":[],"categories":["Research-Grade Frameworks"],"sub_categories":[],"readme":"# whisper_ros\n\nThis repository provides a set of ROS 2 packages to integrate [whisper.cpp](https://github.com/ggerganov/whisper.cpp) into ROS 2 using [audio_common](https://github.com/mgonzs13/audio_common) [4.0.6](https://github.com/mgonzs13/audio_common/releases/tag/4.0.6). 
Besides, [silero-vad](https://github.com/snakers4/silero-vad) is used to perform VAD (Voice Activity Detection).\n\n\u003cdiv align=\"center\"\u003e\n\n[![License: MIT](https://img.shields.io/badge/GitHub-MIT-informational)](https://opensource.org/license/mit) [![GitHub release](https://img.shields.io/github/release/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/releases) [![Code Size](https://img.shields.io/github/languages/code-size/mgonzs13/whisper_ros.svg?branch=main)](https://github.com/mgonzs13/whisper_ros?branch=main) [![Last Commit](https://img.shields.io/github/last-commit/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/commits/main) [![GitHub issues](https://img.shields.io/github/issues/mgonzs13/whisper_ros)](https://github.com/mgonzs13/whisper_ros/issues) [![GitHub pull requests](https://img.shields.io/github/issues-pr/mgonzs13/whisper_ros)](https://github.com/mgonzs13/whisper_ros/pulls) [![Contributors](https://img.shields.io/github/contributors/mgonzs13/whisper_ros.svg)](https://github.com/mgonzs13/whisper_ros/graphs/contributors) [![Python Formatter Check](https://github.com/mgonzs13/whisper_ros/actions/workflows/python-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/python-formatter.yml?branch=main) [![C++ Formatter Check](https://github.com/mgonzs13/whisper_ros/actions/workflows/cpp-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/cpp-formatter.yml?branch=main)\n\n| ROS 2 Distro |                           Branch                            |                                                                                                         Build status                                                                                                         |                                                                 Docker Image                                                                 | Documentation     
                                                                                                                                                 |\n| :----------: | :---------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |\n|  **Humble**  | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) |  [![Humble Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/humble-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/humble-docker-build.yml?branch=main)   |  [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-humble-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=humble)  | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |\n|   **Iron**   | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) |     [![Iron Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/iron-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/iron-docker-build.yml?branch=main)      |    [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-iron-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=iron)    | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |\n|  **Jazzy**   | 
[`main`](https://github.com/mgonzs13/whisper_ros/tree/main) |    [![Jazzy Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/jazzy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/jazzy-docker-build.yml?branch=main)    |   [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-jazzy-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=jazzy)   | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |\n|  **Kilted**  | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) |  [![Kilted Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/kilted-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/kilted-docker-build.yml?branch=main)   |  [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-kilted-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=kilted)  | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |\n| **Rolling**  | [`main`](https://github.com/mgonzs13/whisper_ros/tree/main) | [![Rolling Build](https://github.com/mgonzs13/whisper_ros/actions/workflows/rolling-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/whisper_ros/actions/workflows/rolling-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-rolling-blue)](https://hub.docker.com/r/mgons/whisper_ros/tags?name=rolling) | [![Doxygen Deployment](https://github.com/mgonzs13/whisper_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/whisper_ros/latest) |\n\n\u003c/div\u003e\n\n## Table of Contents\n\n1. [Related Projects](#related-projects)\n2. [Installation](#installation)\n3. [Docker](#docker)\n4. [Usage](#usage)\n5. 
[Demos](#demos)\n\n## Related Projects\n\n- [chatbot_ros](https://github.com/mgonzs13/chatbot_ros) \u0026rarr; This chatbot, integrated into ROS 2, uses whisper_ros to listen to people's speech and [llama_ros](https://github.com/mgonzs13/llama_ros/tree/main) to generate responses. The chatbot is controlled by a state machine created with [YASMIN](https://github.com/uleroboticsgroup/yasmin).\n\n## Installation\n\nTo run whisper_ros with CUDA, you must first install the [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit). To run SileroVAD with ONNX and CUDA, you must also install [cuDNN](https://developer.nvidia.com/cudnn-downloads).\n\n```shell\ncd ~/ros2_ws/src\ngit clone https://github.com/mgonzs13/audio_common.git\ngit clone https://github.com/mgonzs13/whisper_ros.git\ncd ~/ros2_ws\nrosdep install --from-paths src --ignore-src -r -y\ncolcon build --cmake-args -DGGML_CUDA=ON -DONNX_GPU=ON # To use CUDA on Whisper and on Silero, respectively\n```\n\n## Docker\n\nBuild the whisper_ros Docker image. Optionally, you can build whisper_ros with CUDA (`USE_CUDA`) and choose the CUDA version (`CUDA_VERSION`). Remember that you have to use `DOCKER_BUILDKIT=0` to compile whisper_ros with CUDA when building the image.\n\n```shell\nDOCKER_BUILDKIT=0 docker build -t whisper_ros --build-arg USE_CUDA=1 --build-arg CUDA_VERSION=12-6 .\n```\n\nRun the Docker container. 
If you want to use CUDA, you have to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) and add `--gpus all`.\n\n```shell\ndocker run -it --rm --gpus all whisper_ros\n```\n\n## Usage\n\nRun Silero for VAD and Whisper for STT:\n\n```shell\nros2 launch whisper_bringup whisper.launch.py\n```\n\nAdd the parameter `silero_vad_use_cuda:=True` to use Silero with CUDA.\n\n## Demos\n\nSend an action goal to listen:\n\n```shell\nros2 action send_goal /whisper/listen whisper_msgs/action/STT \"{}\"\n```\n\nOr try the example of a whisper client:\n\n```shell\nros2 run whisper_demos whisper_demo_node\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmgonzs13%2Fwhisper_ros","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmgonzs13%2Fwhisper_ros","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmgonzs13%2Fwhisper_ros/lists"}