https://github.com/numq/voice-activity-detection

JVM library for voice activity detection written in Kotlin based on C library fvad and Silero
cpp fvad java jni jvm kotlin libfvad ml onnx silero silero-vad vad voice-activity-detection

Last synced: 7 months ago
Host: GitHub
URL: https://github.com/numq/voice-activity-detection
Owner: numq
License: apache-2.0
Created: 2024-11-25T22:56:14.000Z (10 months ago)
Default Branch: master
Last Pushed: 2025-03-01T15:44:25.000Z (7 months ago)
Last Synced: 2025-03-01T16:31:05.289Z (7 months ago)
Topics: cpp, fvad, java, jni, jvm, kotlin, libfvad, ml, onnx, silero, silero-vad, vad, voice-activity-detection
Language: Kotlin
Homepage:
Size: 2.49 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Voice Activity Detection

JVM library for voice activity detection written in Kotlin, based on the C library [libfvad](https://github.com/dpirch/libfvad) and the ML model [Silero](https://github.com/snakers4/silero-vad)

### See also

- [Stretch](https://github.com/numq/stretch) *to change the speed of audio without changing the pitch*

- [Speech recognition](https://github.com/numq/speech-recognition) *to transcribe audio to text*

- [Speech generation](https://github.com/numq/speech-generation) *to generate voice audio from text*

- [Text generation](https://github.com/numq/text-generation) *to generate text from prompt*

- [Noise reduction](https://github.com/numq/noise-reduction) *to remove noise from audio*

## When to use

> [!NOTE]
> For best results, it is recommended to apply noise reduction to the input data.

### libfvad

Detects any audio activity, regardless of the sound type. The detection behavior depends on the selected mode. Suitable for general voice activity detection.

### Silero

Detects only activity that contains human speech. Best for speech-focused tasks like transcription and voice-controlled systems.

## Features

- Detects voice activity in PCM audio data

- Supports any sampling rate and number of channels by resampling and downmixing the input

- Supports different detection modes to balance between sensitivity and accuracy (fvad)
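To illustrate what downmixing means here, the following is a minimal sketch that averages the channels of interleaved PCM into a single mono channel. It assumes 16-bit little-endian samples, and `downmixToMono` is a hypothetical helper written for this example; it is not the library's own implementation, which may use a different sample format or mixing strategy.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper, not part of the library:
// averages interleaved 16-bit little-endian PCM channels into one mono channel.
fun downmixToMono(pcm: ByteArray, channels: Int): ByteArray {
    require(channels > 0) { "channels must be positive" }
    val frameBytes = channels * 2 // 2 bytes per 16-bit sample
    require(pcm.size % frameBytes == 0) { "input must contain whole sample frames" }
    if (channels == 1) return pcm.copyOf()

    val input = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer()
    val output = ByteBuffer.allocate(pcm.size / channels).order(ByteOrder.LITTLE_ENDIAN)
    val frameCount = pcm.size / frameBytes
    for (frame in 0 until frameCount) {
        var sum = 0
        for (channel in 0 until channels) {
            sum += input.get().toInt() // read the next interleaved sample
        }
        output.putShort((sum / channels).toShort()) // the average always fits in 16 bits
    }
    return output.array()
}
```

Averaging rather than summing keeps the result inside the 16-bit range, at the cost of lowering the level of sounds that are present in only one channel.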

## Installation

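- Download the latest [release](https://github.com/numq/voice-activity-detection/releases)

- Add the library as a dependency

   ```kotlin
   dependencies {
        // files(...) wraps the local JAR in a file collection, which Gradle accepts as a dependency
        implementation(files("/path/to/jar"))
   }
   ```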

### libfvad

- Unzip binaries

### Silero

- Add ONNX dependency

   ```kotlin
   dependencies {
        implementation("com.microsoft.onnxruntime:onnxruntime:1.20.0")
   }
   ```

## Usage

> See the [example](example) module for implementation details

### TL;DR

- Call `detect` to process the input data; pass `isContinuous = true` when working with streaming audio

### Step-by-step

- Load binaries if you are going to use fvad

   ```kotlin
   VoiceActivityDetection.Fvad.load(libfvad = "/path/to/libfvad", voiceActivityDetection = "/path/to/voice-activity-detection")
   ```

- Create an instance

  ### fvad

  ```kotlin
  VoiceActivityDetection.Fvad.create()
  ```

  ### Silero

  ```kotlin
  VoiceActivityDetection.Silero.create()
  ```

- Call `inputSizeForMillis` to get the input data size for N milliseconds

- Call `minimumInputSize` to get the audio producer buffer size for real-time detection

- Call `detect` passing the input data, sample rate and number of channels as arguments

- Call `reset` to reset the internal state - for example when the audio source changes

- Call `close` to release resources
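As a rough guide to the sizes involved in the steps above, the byte length of N milliseconds of interleaved PCM can be estimated from the sample rate and channel count. The sketch below assumes 16-bit samples by default, and `pcmSizeForMillis` is a hypothetical helper written for this example; the library's `inputSizeForMillis` may compute its result differently, so prefer the library call in real code.

```kotlin
// Hypothetical helper, not part of the library:
// byte length of `millis` milliseconds of interleaved PCM audio.
fun pcmSizeForMillis(sampleRate: Int, channels: Int, millis: Int, bytesPerSample: Int = 2): Int {
    require(sampleRate > 0 && channels > 0 && bytesPerSample > 0) { "format values must be positive" }
    require(millis >= 0) { "millis must not be negative" }
    // Sample frames in the interval, rounded down; Long avoids intermediate overflow.
    val frames = sampleRate.toLong() * millis / 1000
    return Math.toIntExact(frames * channels * bytesPerSample)
}
```

For example, `pcmSizeForMillis(16_000, 1, 30)` evaluates to 960 bytes: 480 sample frames of one 2-byte sample each.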

## Requirements

- JVM version 9 or higher

## License

This project is licensed under the [Apache License 2.0](LICENSE)

## Acknowledgments

- [libfvad](https://github.com/dpirch/libfvad)

- [Silero](https://github.com/snakers4/silero-vad)
