https://github.com/gorkemkaramolla/whisper-run

Faster Whisper with Speaker Diarization

distil-whisper faster-whisper openai pyannote speaker-diarization speech-recognition transcription whisper whisper-large

Last synced: 8 months ago

Host: GitHub
URL: https://github.com/gorkemkaramolla/whisper-run
Owner: gorkemkaramolla
License: apache-2.0
Created: 2024-06-26T14:40:37.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-10-17T12:53:46.000Z (12 months ago)
Last Synced: 2025-01-30T18:05:37.070Z (8 months ago)
Topics: distil-whisper, faster-whisper, openai, pyannote, speaker-diarization, speech-recognition, transcription, whisper, whisper-large
Language: Python
Homepage: https://pypi.org/project/whisper-run/
Size: 8.32 MB
Stars: 5
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Whisper-Run

Whisper-Run is a pip-installable CLI tool that processes audio files with Whisper models and adds speaker diarization. You select the model to use for each run, and the results are saved in JSON format.

It runs the [OpenAI Whisper](https://github.com/openai/whisper) models through [faster-whisper](https://github.com/SYSTRAN/faster-whisper), a reimplementation built on the CTranslate2 library, and uses [pyannote's speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) for diarization. Check their documentation if needed.

## Installation

To install Whisper-Run, run the following command:

```bash
pip install whisper-run
```

## Usage

You can call Whisper-Run from the command line using the following syntax:

```bash
whisper-run --file_path=your_file_path
```

### Example

To process the audio file at a given path on the CPU:

```bash
whisper-run --device=cpu --file_path=your_file_path
```

When you run the command, you'll be prompted to select a model for audio processing:

```
[?] Select a model for audio processing:
> distil-large-v3
distil-large-v2
large-v3
large-v2
large
medium
small
base
tiny
```

### Flags

- `--device`: Specify the device to use for processing (e.g., `cpu` or `cuda`).
- `--file_path`: Specify the path to the audio file you want to process.
- `--hf_auth_token`: Optional. Pass the Hugging Face Auth Token or set the `HF_AUTH_TOKEN` environment variable.
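
The flags above can also be assembled from Python, for example to launch the CLI from another script. The sketch below uses only the three flags listed here; `build_command` is a hypothetical helper, not part of Whisper-Run. It adds `--hf_auth_token` only when a token is passed explicitly, on the assumption that the CLI otherwise reads `HF_AUTH_TOKEN` from the environment as described.

```python
import shlex


def build_command(file_path, device="cpu", hf_auth_token=None):
    """Build the argument list for a whisper-run call (hypothetical helper)."""
    command = [
        "whisper-run",
        f"--device={device}",
        f"--file_path={file_path}",
    ]
    # Only pass the token flag when one is given explicitly; otherwise the
    # HF_AUTH_TOKEN environment variable is assumed to be picked up by the CLI.
    if hf_auth_token:
        command.append(f"--hf_auth_token={hf_auth_token}")
    return command


command = build_command("your_file_path", device="cuda")
# shlex.join quotes arguments so the printed line is safe to paste into a shell.
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command)` from a terminal session. The model selection prompt still appears, so the run stays interactive.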

### Programmatic Usage

You can also use Whisper-Run programmatically in your Python scripts through the `AudioProcessor` class:

#### Example Script

```python
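from whisper_run import AudioProcessor


def main():
    # Configure the processor with the audio file, device, and Whisper model.
    processor = AudioProcessor(
        file_path="your_file_path",
        device="cpu",
        model_name="large-v3",
    )
    # Process the audio file.
    processor.process()


if __name__ == "__main__":
    main()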
```
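
To process several files, the same class can be called in a loop. The following is a minimal sketch, assuming `AudioProcessor` accepts exactly the keyword arguments shown in the example script and that `process()` can be called once per file. The helper names, the `processor_cls` parameter, and the list of audio extensions are illustrative assumptions, not part of Whisper-Run.

```python
from pathlib import Path

# Extensions treated as audio are an assumption made for this sketch,
# not a list taken from Whisper-Run.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac", ".m4a"}


def find_audio_files(directory):
    """Return the audio files directly inside `directory`, sorted by path."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_SUFFIXES
    )


def process_directory(directory, device="cpu", model_name="large-v3", processor_cls=None):
    """Process every audio file in `directory` and return the processed paths."""
    if processor_cls is None:
        # Imported lazily so the helper can be tested without the package installed.
        from whisper_run import AudioProcessor
        processor_cls = AudioProcessor

    processed = []
    for path in find_audio_files(directory):
        processor = processor_cls(
            file_path=str(path),
            device=device,
            model_name=model_name,
        )
        processor.process()
        processed.append(path)
    return processed
```

Calling `process_directory("recordings", device="cpu", model_name="small")` would then handle each matching file in a hypothetical `recordings` folder in turn. Passing a stand-in class as `processor_cls` makes it possible to check the loop without loading any model.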

## Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

## License

This project is licensed under the Apache 2.0 License.
