Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/erseco/transcriber

Last synced: about 1 month ago
JSON representation

Host: GitHub
URL: https://github.com/erseco/transcriber
Owner: erseco
Created: 2024-06-15T08:48:36.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-06-15T08:50:15.000Z (8 months ago)
Last Synced: 2024-11-18T15:06:02.220Z (3 months ago)
Language: Python
Size: 3.91 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Transcriber

Transcriber is a simple Python script that converts audio files (MP3, WAV, WEBM, or MP4) to text using the OpenAI Whisper model. It supports multiple languages and output formats.

## Features

- Transcribe audio files in MP3, WAV, WEBM, or MP4 format.
- Supports multiple Whisper models: tiny, base, small, medium, large.
- Specify the language of the audio for accurate transcription.
- Output the transcription in different formats: TXT, SRT, JSON.
- Verbose mode for detailed transcription process.

## Requirements

- Python 3.7 or higher
- pydub
- ffmpeg
- whisper

## Installation

1. Clone the repository:
```sh
git clone https://github.com/erseco/transcriber.git
cd transcriber
```

2. Create and activate a virtual environment (optional but recommended):
```sh
python3 -m venv env
source env/bin/activate # On Windows use `env\Scripts\activate`
```

3. Install the required packages:
```sh
pip install -r requirements.txt
```

4. Make sure you have `ffmpeg` installed. You can download it from [here](https://ffmpeg.org/download.html) and follow the installation instructions for your operating system.

## Usage

```sh
python transcriber.py [--language ] [--model ] [--output_format ] [--verbose]
```

### Arguments

- ``: Path to the MP3, WAV, WEBM, or MP4 file to transcribe.
- `--language`: Language of the audio for transcription (default: `es` for Spanish).
- `--model`: Whisper model to use for transcription (default: `medium`). Options: `tiny`, `base`, `small`, `medium`, `large`.
- `--output_format`: Output format for the transcription (default: `txt`). Options: `txt`, `srt`, `json`.
- `--verbose`: Enable verbose output during transcription.

### Examples

Transcribe an MP3 file to a text file:
```sh
python transcriber.py ~/Downloads/audio.mp3 --language en --model small --output_format txt
```

Transcribe a WEBM file to a JSON file with verbose output:
```sh
python transcriber.py ~/Downloads/audio.webm --language fr --model large --output_format json --verbose
```

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License.