Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gumblex/whisper_vad
Whisper.cpp Speech-to-text with Voice Activity Detection
speech-to-text whisper whisper-cpp
- Host: GitHub
- URL: https://github.com/gumblex/whisper_vad
- Owner: gumblex
- License: mit
- Created: 2022-12-20T11:04:12.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2024-11-03T05:08:04.000Z (about 2 months ago)
- Last Synced: 2024-11-03T06:17:19.919Z (about 2 months ago)
- Topics: speech-to-text, whisper, whisper-cpp
- Language: Python
- Size: 2.34 MB
- Stars: 11
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Whisper VAD
===========

[Whisper.cpp](https://github.com/ggerganov/whisper.cpp) Speech-to-Text engine combined with [Silero Voice Activity Detector](https://github.com/snakers4/silero-vad).
This improves transcription speed and quality, and can help avoid model hallucinations.

Run `whisper_vad.py` directly to transcribe any video/audio file into SRT subtitles, or import it as a library.
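
To illustrate the general idea of turning detected speech spans into SRT subtitles, here is a minimal sketch. It is not taken from `whisper_vad.py`: the names `merge_spans`, `srt_timestamp`, `to_srt` and the `max_gap` parameter are hypothetical, and the speech spans and texts are assumed to come from a voice activity detector and a transcriber respectively.

```python
from typing import List, Tuple

Span = Tuple[float, float]       # (start_seconds, end_seconds) of detected speech
Cue = Tuple[float, float, str]   # (start_seconds, end_seconds, transcribed text)


def merge_spans(spans: List[Span], max_gap: float = 0.5) -> List[Span]:
    """Merge speech spans separated by less than max_gap seconds of silence.

    Hypothetical helper: the merging policy of the real project may differ.
    """
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] < max_gap:
            # Close enough to the previous span: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as SRT text: index, time range, text, blank separator."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)
```

Only the merged speech spans would need to be passed to the transcriber, which is where skipping silence can save time.
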
## Dependencies
* ffmpeg (command)
* openblas (system library)
* cffi
* torch
* scipy
* zhconv: Chinese post-processing

## Build and usage
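
Before building, it may help to confirm that the dependencies listed above are available. The following is a small sketch using only the Python standard library; the function name `missing_dependencies` is hypothetical and not part of this project, and it assumes each Python package is importable under the same name as listed. The openblas system library is not checked here.

```python
import importlib.util
import shutil
from typing import List, Sequence

# Names taken from the dependency list above (assumed to match import names).
PYTHON_MODULES = ("cffi", "torch", "scipy", "zhconv")
COMMANDS = ("ffmpeg",)


def missing_dependencies(
    modules: Sequence[str] = PYTHON_MODULES,
    commands: Sequence[str] = COMMANDS,
) -> List[str]:
    """Return a list describing commands and modules that cannot be found."""
    # shutil.which returns None when the executable is not on PATH.
    missing = [f"command: {name}" for name in commands if shutil.which(name) is None]
    # find_spec returns None when a top-level module is not installed.
    missing += [
        f"module: {name}"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing dependencies:")
        for item in problems:
            print(" ", item)
    else:
        print("All checked dependencies were found.")
```
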
### Simple
1. `pip install -r requirements.txt`
2. `make`

### Custom Device
1. `git submodule update --init --recursive`
2. `cd whisper.cpp`
3. [Compile whisper.cpp to match your device](https://github.com/ggerganov/whisper.cpp)
   1. `cmake -B build` (add any build options)
   2. `cmake --build build --config Release -j8`
4. `pip install -r requirements.txt`
5. `make`

### Usage
Run `python3 whisper_vad.py --help` to see the available options.