https://github.com/mahimairaja/fastrtc-whisper-cpp
A PyPI package that wraps Whisper.cpp for speech-to-text (STT) transcription, compatible with FastRTC
- Host: GitHub
- URL: https://github.com/mahimairaja/fastrtc-whisper-cpp
- Owner: mahimairaja
- License: mit
- Created: 2025-05-22T20:03:52.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-06-06T00:55:04.000Z (10 months ago)
- Last Synced: 2025-09-19T03:28:05.963Z (7 months ago)
- Topics: fastrtc, hugging-face, stt, whisper-ai, whisper-cpp
- Language: Python
- Homepage: https://pypi.org/project/fastrtc-whisper-cpp/
- Size: 800 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# Whisper CPP for FastRTC
A PyPI package that wraps Whisper.cpp for speech-to-text (STT) transcription, compatible with the FastRTC STTModel protocol. This package provides efficient, CPU-based speech recognition using the optimized Whisper.cpp implementation.
## Installation
```bash
pip install fastrtc-whisper-cpp
```
For audio file loading capabilities, install with the audio extras:
```bash
pip install "fastrtc-whisper-cpp[audio]"
```
For development:
```bash
pip install "fastrtc-whisper-cpp[dev]"
```
## Usage
### Basic Usage
```python
from fastrtc_whisper_cpp import get_stt_model
import numpy as np
# Create the model (downloads from HF if not cached)
model = get_stt_model()
# Example: Create a sample audio array (actual audio would come from a file or mic)
sample_rate = 16000
audio_data = np.zeros(16000, dtype=np.float32) # 1 second of silence
# Transcribe
text = model.stt((sample_rate, audio_data))
print(f"Transcription: {text}")
```
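Microphone and telephony sources often deliver 16-bit integer PCM rather than the float32 array used above. Whether this package converts such input internally is not stated here, so the following is only a minimal sketch of doing the conversion explicitly before calling `model.stt`. The helper name `to_float32_mono` and the assumed `(num_samples, num_channels)` layout for multi-channel input are illustrative choices, not part of the package.

```python
import numpy as np


def to_float32_mono(audio: np.ndarray) -> np.ndarray:
    """Return a 1-D float32 array; integer PCM is scaled into [-1.0, 1.0)."""
    source = np.asarray(audio)
    samples = source
    if samples.ndim == 2:
        # Assumption: shape is (num_samples, num_channels); average to mono.
        samples = samples.mean(axis=1)
    if np.issubdtype(source.dtype, np.integer):
        # Divide by the type's full-scale magnitude (32768 for int16).
        scale = float(np.iinfo(source.dtype).max) + 1.0
        samples = samples / scale
    return samples.astype(np.float32)


# Three int16 samples standing in for real microphone data.
pcm = np.array([0, 16384, -32768], dtype=np.int16)
audio_data = to_float32_mono(pcm)
```

The result can then be passed as `model.stt((16000, audio_data))`, provided the source really was sampled at 16 kHz; this sketch does not resample.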
### Loading Audio Files
If you've installed with the audio extras:
```python
from fastrtc_whisper_cpp import get_stt_model, load_audio
# Load model
model = get_stt_model()
# Load audio file (automatically resamples to 16kHz)
audio = load_audio("path/to/audio.wav")
# Transcribe
text = model.stt(audio)
print(f"Transcription: {text}")
```
### Using with FastRTC
```python
from fastrtc_whisper_cpp import get_stt_model
# Create the model
whisper_model = get_stt_model()
# Use within FastRTC applications
# (Follow FastRTC documentation for integration details)
```
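Because the model exposes a single `stt((sample_rate, samples))` method that returns text, application code can be written against that shape rather than against a concrete class. The sketch below illustrates the idea with a local `typing.Protocol`; `SpeechToText`, `make_transcribing_handler`, and `FakeModel` are hypothetical names introduced for this example and are not FastRTC or package APIs. How the handler is wired into a FastRTC stream is left to the FastRTC documentation.

```python
from typing import Any, Callable, Protocol


class SpeechToText(Protocol):
    """Hypothetical structural type: any object with a matching stt() method."""

    def stt(self, audio: tuple[int, Any]) -> str: ...


def make_transcribing_handler(
    model: SpeechToText,
) -> Callable[[tuple[int, Any]], str]:
    """Build a handler that transcribes one (sample_rate, samples) chunk."""

    def handler(audio: tuple[int, Any]) -> str:
        text = model.stt(audio).strip()
        # Fall back to a placeholder when the model returns only whitespace.
        return text if text else "[no speech detected]"

    return handler


class FakeModel:
    """Stand-in so the sketch runs without downloading a Whisper model."""

    def stt(self, audio: tuple[int, Any]) -> str:
        sample_rate, samples = audio
        return f" {len(samples)} samples at {sample_rate} Hz "


handler = make_transcribing_handler(FakeModel())
print(handler((16000, [0.0] * 16000)))
```

In a real application, the object returned by `get_stt_model()` would take the place of `FakeModel()`.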
## Available Models
The package supports various Whisper.cpp models with different sizes and quantization levels:
- English-only models (faster, smaller):
  - `tiny.en`, `tiny.en-q5_1`, `tiny.en-q8_0`
  - `base.en`, `base.en-q5_1`, `base.en-q8_0`
  - `small.en`, `small.en-q5_1`, `small.en-q8_0`
  - `medium.en`, `medium.en-q5_0`, `medium.en-q8_0`
- Multilingual models:
  - `tiny`, `tiny-q5_1`, `tiny-q8_0`
  - `base`, `base-q5_1`, `base-q8_0`
  - `small`, `small-q5_1`, `small-q8_0`
  - `medium`, `medium-q5_0`, `medium-q8_0`
  - `large-v1`
  - `large-v2`, `large-v2-q5_0`, `large-v2-q8_0`
  - `large-v3`, `large-v3-q5_0`
  - `large-v3-turbo`, `large-v3-turbo-q5_0`, `large-v3-turbo-q8_0`
Example:
```python
from fastrtc_whisper_cpp import get_stt_model
# Choose a specific model
model = get_stt_model("medium.en-q8_0")
```
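The names listed above follow a visible pattern: a size, an optional `.en` suffix for English-only models, and an optional `-q…` suffix for the quantization level. As a rough sketch of that convention (the `ModelSpec` class and `parse_model_name` function are hypothetical helpers, not part of the package), a name can be split into its parts like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    size: str                    # e.g. "tiny", "medium", "large-v3-turbo"
    english_only: bool           # True when the name carries the ".en" suffix
    quantization: Optional[str]  # e.g. "q5_1", or None for unquantized


def parse_model_name(name: str) -> ModelSpec:
    """Split a model name from the list above into its components."""
    # Everything after the first "-q" is the quantization tag, if present.
    base, separator, quant = name.partition("-q")
    english_only = base.endswith(".en")
    size = base[: -len(".en")] if english_only else base
    return ModelSpec(size, english_only, f"q{quant}" if separator else None)


print(parse_model_name("medium.en-q8_0"))
print(parse_model_name("large-v3-turbo"))
```

A helper like this can be handy for filtering the list, for instance to pick only English-only quantized variants. It does not check that a name actually exists in the list.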
## Advanced Configuration
You can configure the model with specific parameters:
```python
from fastrtc_whisper_cpp import WhisperCppSTT
# Configure with specific model and models directory
model = WhisperCppSTT(
    model="medium.en",
    models_dir="/path/to/models",  # Optional custom models directory
)
```
## Requirements
- Python 3.10+
- numpy
- pywhispercpp
- librosa (optional, for audio file loading)
- click (for CLI features)
## Development
Clone the repository and install in development mode:
```bash
git clone https://github.com/mahimairaja/fastrtc-whisper-cpp.git
cd fastrtc-whisper-cpp
pip install -e ".[dev,audio]"
```
## License
MIT