https://github.com/wiseman/py-webrtcvad

Python interface to the WebRTC Voice Activity Detector

Host: GitHub
URL: https://github.com/wiseman/py-webrtcvad
Owner: wiseman
License: other
Created: 2016-04-23T05:03:52.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2024-07-04T02:23:24.000Z (over 1 year ago)
Last Synced: 2025-05-13T02:38:47.971Z (7 months ago)
Language: C
Size: 244 KB
Stars: 2,230
Watchers: 49
Forks: 417
Open Issues: 46
Metadata Files:
- Readme: README.rst
- License: LICENSE

Awesome Lists containing this project

- awesome-voice-agents: py-webrtcvad (webrtcvad). Python interface to Google WebRTC Voice Activity Detector. Classic signal processing approach; a classic, lightweight, fast option. (VAD (Voice Activity Detection) / Core VAD Models)
- awesome-python-scientific-audio: py-webrtcvad (webrtcvad), https://pypi.python.org/pypi/webrtcvad/. Interface to the WebRTC Voice Activity Detector. (Audio Related Packages)

README

.. image:: https://travis-ci.org/wiseman/py-webrtcvad.svg?branch=master
    :target: https://travis-ci.org/wiseman/py-webrtcvad

py-webrtcvad
============

This is a Python interface to the WebRTC Voice Activity Detector
(VAD). It is compatible with Python 2 and Python 3.

A VAD classifies a piece of audio data as being voiced or unvoiced.
It can be useful for telephony and speech recognition.

The VAD that Google developed for the WebRTC project is reportedly
one of the best available, being fast, modern and free.

How to use it
-------------

0. Install the webrtcvad module::

     pip install webrtcvad

1. Create a ``Vad`` object::

     import webrtcvad
     vad = webrtcvad.Vad()

2. Optionally, set its aggressiveness mode, which is an integer
   between 0 and 3. 0 is the least aggressive about filtering out
   non-speech, 3 is the most aggressive. (You can also set the mode
   when you create the VAD, e.g. ``vad = webrtcvad.Vad(3)``)::

     vad.set_mode(1)

3. Give it a short segment ("frame") of audio. The WebRTC VAD only
   accepts 16-bit mono PCM audio, sampled at 8000, 16000, 32000 or 48000 Hz.
   A frame must be either 10, 20, or 30 ms in duration::

     # Run the VAD on 10 ms of silence. The result should be False.
     sample_rate = 16000
     frame_duration = 10  # ms
     frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)
     print('Contains speech: %s' % vad.is_speech(frame, sample_rate))
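
The frame constraints in step 3 can be checked before audio reaches the
VAD. The following is a minimal sketch, not part of webrtcvad:
``frame_size_bytes`` and ``split_frames`` are hypothetical helper names,
and the code assumes the input is raw 16-bit mono PCM bytes.

```python
# Hypothetical helpers for illustration; not part of the webrtcvad API.
SUPPORTED_RATES = (8000, 16000, 32000, 48000)  # Hz
SUPPORTED_DURATIONS = (10, 20, 30)  # ms
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM


def frame_size_bytes(sample_rate, frame_duration_ms):
    """Return the byte length of one frame, or raise ValueError."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError('unsupported sample rate: %r' % (sample_rate,))
    if frame_duration_ms not in SUPPORTED_DURATIONS:
        raise ValueError('unsupported frame duration: %r' % (frame_duration_ms,))
    samples = sample_rate * frame_duration_ms // 1000
    return samples * BYTES_PER_SAMPLE


def split_frames(pcm, sample_rate, frame_duration_ms):
    """Yield consecutive whole frames from raw PCM bytes.

    A trailing chunk shorter than one frame is dropped.
    """
    size = frame_size_bytes(sample_rate, frame_duration_ms)
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]
```

Each frame yielded by ``split_frames`` could then be passed to
``vad.is_speech(frame, sample_rate)`` as in step 3.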

See ``example.py`` in the repository for a more detailed example that
processes a .wav file, finds the voiced segments, and writes each one
as a separate .wav file.
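
As a rough illustration of what "finding the voiced segments" can mean,
here is a simplified sketch. It is not the logic used in ``example.py``:
it only merges runs of consecutive voiced frames, with no smoothing or
padding. ``voiced_segments`` is a hypothetical name, and the classifier
is passed in as a callable so the grouping can be tried without audio
hardware or a real detector.

```python
def voiced_segments(frames, is_speech):
    """Group consecutive voiced frames into (first, last) index pairs.

    frames: an iterable of PCM frames.
    is_speech: a callable taking one frame and returning True or False.
    """
    segments = []
    start = None  # index of the first frame in the current voiced run
    index = -1    # stays -1 if frames is empty
    for index, frame in enumerate(frames):
        if is_speech(frame):
            if start is None:
                start = index
        elif start is not None:
            # The voiced run ended on the previous frame.
            segments.append((start, index - 1))
            start = None
    if start is not None:
        # The audio ended while still inside a voiced run.
        segments.append((start, index))
    return segments
```

With a real detector, the callable could be
``lambda frame: vad.is_speech(frame, sample_rate)``.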

How to run unit tests
---------------------

To run unit tests::

    pip install -e ".[dev]"
    python setup.py test

History
-------

2.0.10

Fixed memory leak. Thank you, bond005!

2.0.9

Improved example code. Added WebRTC license.

2.0.8

Fixed Windows compilation errors. Thank you, xiongyihui!
