{"id":21423952,"url":"https://github.com/picovoice/voice-activity-benchmark","last_synced_at":"2025-07-14T08:31:37.481Z","repository":{"id":37777972,"uuid":"416892991","full_name":"Picovoice/voice-activity-benchmark","owner":"Picovoice","description":"Voice activity engine benchmark framework","archived":false,"fork":false,"pushed_at":"2025-05-08T18:35:14.000Z","size":304,"stargazers_count":13,"open_issues_count":1,"forks_count":3,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-05-08T19:39:57.621Z","etag":null,"topics":["benchmark","benchmark-framework","vad","voice-activity"],"latest_commit_sha":null,"homepage":"https://picovoice.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Picovoice.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-10-13T20:45:54.000Z","updated_at":"2025-05-08T18:35:13.000Z","dependencies_parsed_at":"2023-02-14T12:30:54.496Z","dependency_job_id":null,"html_url":"https://github.com/Picovoice/voice-activity-benchmark","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Picovoice/voice-activity-benchmark","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Picovoice%2Fvoice-activity-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Picovoice%2Fvoice-activity-benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Picovoice%2Fvoice-activity-benchmark/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/Picovoice%2Fvoice-activity-benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Picovoice","download_url":"https://codeload.github.com/Picovoice/voice-activity-benchmark/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Picovoice%2Fvoice-activity-benchmark/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265262611,"owners_count":23736428,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","benchmark-framework","vad","voice-activity"],"created_at":"2024-11-22T21:19:00.937Z","updated_at":"2025-07-14T08:31:37.029Z","avatar_url":"https://github.com/Picovoice.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Voice Activity Benchmark\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Picovoice/voice-activity-benchmark/blob/master/LICENSE)\n\nMade in Vancouver, Canada by [Picovoice](https://picovoice.ai)\n\nThe purpose of this benchmarking framework is to provide a scientific comparison between different voice activity\nengines in terms of accuracy metrics. 
While working on [Cobra](https://github.com/Picovoice/Cobra)\nwe noted that there is a need for such a tool to empower customers to make data-driven decisions.\n\n\n# Data\n\n[LibriSpeech](http://www.openslr.org/12/) (test-clean portion) is used as the voice dataset.\nIt can be downloaded from [OpenSLR](http://www.openslr.org/resources/12/test-clean.tar.gz).\n\nIn order to simulate real-world situations, the data is mixed with noise (at 0dB SNR). For this purpose, we use the\n[DEMAND](https://asa.scitation.org/doi/abs/10.1121/1.4799597) dataset, which has noise recordings in 18 different\nenvironments (e.g. kitchen, office, traffic, etc.). Recordings that contained distinct voice data are filtered out.\nIt can be downloaded from [Kaggle](https://www.kaggle.com/aanhari/demand-dataset).\n\n\n# Voice Activity Engines\n\nTwo voice-activity engines are used:\n[py-webrtcvad](https://github.com/wiseman/py-webrtcvad) (Python bindings to the WEBRTC VAD)\nwhich can be installed using [PyPI](https://pypi.org/project/webrtcvad/).\nAnd [Cobra](https://github.com/Picovoice/Cobra) which is included as submodules in this repository.\n\n\n# Metric\n\nWe measured the accuracy of the voice activity engines using false positive and true positive rates.\nThe false positive rate is measured as the number of false positive frames detected over the total number of non-voice frames.\nLikewise, the true positive rate is measured as the number of true positive frames detected over the total number of voice frames.\nUsing these definitions, we plot a receiver operating characteristic curve, which can be used to characterize performance differences between engines.\n\n\n# Usage\n\n### Prerequisites\n\nThe benchmark has been developed on Ubuntu 18.04 with Python 3.8. 
Clone the repository using\n\n```bash\ngit clone https://github.com/Picovoice/voice-activity-benchmark.git\n```\n\nMake sure the Python packages in the [requirements.txt](/requirements.txt) are properly installed for your Python\nversion as Python bindings are used for running the engines.\n\n### Running the Benchmark\n\nUsage information can be retrieved via\n\n```bash\npython benchmark.py -h\n```\n\nThe runtime benchmark is contained in the [runtime](/runtime) folder. Use the following commands to build and run the runtime benchmark:\n```bash\ngit clone --recursive https://github.com/Picovoice/cobra.git runtime/cobra\ncmake -S runtime -B runtime/build \u0026\u0026 cmake --build runtime/build\n./runtime/build/cobra_runtime -l {COBRA_LIBRARY_PATH} -a {ACCESS_KEY} -w {TEST_WAVFILE_PATH}\n```\n\n# Results\n\n## Accuracy\n\nBelow is the result of running the benchmark framework. The plot below shows the receiver operating characteristic curve\nof different engines. This plot was generated with the Signal-To-Noise ratio of 0dB.\n\n![](doc/img/summary.png)\n\n\n## Runtime\n\nOn a Raspberry Pi Zero, Cobra measured a realtime factor of `0.05`, or about `5%` CPU usage.\nOn a laptop with an Intel(R) Core(TM) i7-1185G7, Cobra measured a realtime factor of `0.0006`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpicovoice%2Fvoice-activity-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpicovoice%2Fvoice-activity-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpicovoice%2Fvoice-activity-benchmark/lists"}