https://github.com/gumblex/whisper_vad

Whisper.cpp Speech-to-text with Voice Activity Detection

speech-to-text whisper whisper-cpp



Host: GitHub
URL: https://github.com/gumblex/whisper_vad
Owner: gumblex
License: mit
Created: 2022-12-20T11:04:12.000Z (about 2 years ago)
Default Branch: master
Last Pushed: 2024-11-03T05:08:04.000Z (3 months ago)
Last Synced: 2024-11-03T06:17:19.919Z (3 months ago)
Topics: speech-to-text, whisper, whisper-cpp
Language: Python
Homepage:
Size: 2.34 MB
Stars: 11
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


README

Whisper VAD
===========

[Whisper.cpp](https://github.com/ggerganov/whisper.cpp) Speech-to-Text engine combined with [Silero Voice Activity Detector](https://github.com/snakers4/silero-vad).

Filtering out non-speech audio before transcription improves speed and quality, and helps avoid model hallucinations.
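To illustrate how VAD output can be prepared for a speech-to-text engine, here is a minimal sketch that pads and merges speech segments before transcription. It assumes segments shaped like the list of `{'start', 'end'}` sample offsets that Silero VAD's `get_speech_timestamps` returns. The function name `merge_speech_segments` and the `max_gap` and `padding` defaults are hypothetical and are not taken from `whisper_vad.py`.

```python
from typing import Dict, List, Optional

SAMPLE_RATE = 16000  # whisper.cpp works on 16 kHz audio


def merge_speech_segments(
    segments: List[Dict[str, int]],
    max_gap: int = SAMPLE_RATE // 2,    # hypothetical default: 0.5 s
    padding: int = SAMPLE_RATE // 10,   # hypothetical default: 0.1 s
    total_samples: Optional[int] = None,
) -> List[Dict[str, int]]:
    """Pad each speech segment, then merge segments separated by short gaps.

    All offsets are in samples. The input list is not modified.
    """
    merged: List[Dict[str, int]] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        # Extend the segment slightly so word edges are not clipped.
        start = max(0, seg["start"] - padding)
        end = seg["end"] + padding
        if total_samples is not None:
            end = min(end, total_samples)
        # Join with the previous chunk when the silence between them is short.
        if merged and start - merged[-1]["end"] <= max_gap:
            merged[-1]["end"] = max(merged[-1]["end"], end)
        else:
            merged.append({"start": start, "end": end})
    return merged
```

Each resulting chunk can then be sliced out of the audio buffer and transcribed on its own, so long silent stretches never reach the model.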

Run `whisper_vad.py` directly to transcribe video or audio files into SRT subtitles, or import it as a library.
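For readers unfamiliar with the SRT format, the sketch below shows how timed text segments can be rendered as SRT subtitle blocks. It is an illustration only: the helper names `format_timestamp` and `to_srt` and the `(start, end, text)` tuple layout are assumptions, not the interface of `whisper_vad.py`.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each block: sequence number, time range, subtitle text.
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file with UTF-8 encoding gives a file that most video players can load.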

## Dependencies

* ffmpeg (command)

* openblas (system library)

* cffi

* torch

* scipy

* zhconv (Chinese post-processing)
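Before building, it can help to confirm that the dependencies above are visible from your Python environment. The following is a small sketch of such a check. The function name `check_dependencies` is hypothetical, and the script does not verify OpenBLAS, which is a system library and needs to be checked through your package manager.

```python
import importlib.util
import shutil
from typing import Dict, Iterable

# Names taken from the dependency list above.
COMMANDS = ("ffmpeg",)
PYTHON_MODULES = ("cffi", "torch", "scipy", "zhconv")


def check_dependencies(
    modules: Iterable[str] = PYTHON_MODULES,
    commands: Iterable[str] = COMMANDS,
) -> Dict[str, bool]:
    """Return a mapping of dependency name to whether it was found."""
    status: Dict[str, bool] = {}
    for name in commands:
        # shutil.which returns None when the executable is not on PATH.
        status[name] = shutil.which(name) is not None
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```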

## Build and usage

### Simple

1. `pip install -r requirements.txt`

2. `make`

### Custom Device

1. `git submodule update --init --recursive`

2. `cd whisper.cpp`

3. [Compile whisper.cpp to match your device](https://github.com/ggerganov/whisper.cpp)

   1. `cmake -B build` (add any build options)

   2. `cmake --build build --config Release -j8`

4. `pip install -r requirements.txt`

5. `make`

### Usage

Run `python3 whisper_vad.py --help` to see the available options.