https://github.com/nnnnicholas/transcribe-audiio
- Host: GitHub
- URL: https://github.com/nnnnicholas/transcribe-audiio
- Owner: nnnnicholas
- Created: 2024-06-26T22:42:15.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-06-26T22:43:17.000Z (11 months ago)
- Last Synced: 2025-02-05T12:27:10.502Z (4 months ago)
- Language: Python
- Size: 2.93 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Audio Transcription Script
This Python script automates the process of transcribing MP3 audio files to text using OpenAI's Whisper model. It's designed to efficiently handle large audio files by splitting them into smaller chunks and processing them in parallel.
## Features
- Accepts MP3 files as input
- Splits audio into manageable chunks for efficient processing
- Utilizes parallel processing for faster transcription
- Outputs transcription to a text file
- Handles errors gracefully, continuing even if individual chunks fail
## Prerequisites
Before you begin, ensure you have met the following requirements:
- Python 3.7 or higher
- FFmpeg installed on your system
## Installation
1. Clone this repository:
```
git clone https://github.com/nnnnicholas/transcribe-audiio.git
cd transcribe-audiio
```
2. Install the required Python packages:
```
pip install openai-whisper pydub
```
## Usage
Run the script from the command line, providing the path to your MP3 file:
```
python transcribe_audio.py path/to/your/audio.mp3
```
By default, this will create a file named `transcription_output.txt` in the same directory.
To specify a custom output file:
```
python transcribe_audio.py path/to/your/audio.mp3 -o path/to/output.txt
```
## Options
- `-o, --output`: Specify the path for the output text file (default: `transcription_output.txt`)
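A command-line interface with this shape could be defined with `argparse` roughly as follows. This is a minimal sketch and not the script's actual code: the function name `build_parser` and the argument name `input_file` are assumptions made for illustration. Only the `-o, --output` flag and its default come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented command-line interface."""
    parser = argparse.ArgumentParser(description="Transcribe an MP3 file to text.")
    # Positional argument: path to the input MP3 file (name is hypothetical).
    parser.add_argument("input_file", help="Path to the MP3 file to transcribe")
    # Optional output path, defaulting to the documented file name.
    parser.add_argument(
        "-o",
        "--output",
        default="transcription_output.txt",
        help="Path for the output text file",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["audio.mp3", "-o", "notes.txt"])
print(f"Would transcribe {args.input_file} into {args.output}")
```

Omitting `-o` leaves `args.output` at `transcription_output.txt`.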
## Performance
This script is optimized for multi-core processors and should perform well on systems such as M1 Macs. By default, the audio is split into 1-minute chunks, which are transcribed in parallel.
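The chunk-and-parallelize approach can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not the script's implementation: `chunk_ranges`, `transcribe_all`, and the `transcribe_chunk` callback are hypothetical names, and a thread pool is used here although the script may use a different mechanism. In a real run, the callback would slice the audio (pydub's `AudioSegment` supports millisecond slicing such as `audio[start_ms:end_ms]`) and pass the slice to Whisper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple

CHUNK_MS = 60_000  # 1-minute chunks, matching the documented default


def chunk_ranges(total_ms: int, chunk_ms: int = CHUNK_MS) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) pairs covering the whole recording."""
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


def transcribe_all(
    ranges: List[Tuple[int, int]],
    transcribe_chunk: Callable[[int, int], str],
    max_workers: Optional[int] = None,
) -> List[str]:
    """Transcribe chunks in parallel, keeping results in chunk order.

    A chunk that raises is reported and replaced by an empty string,
    so one failure does not abort the whole run.
    """

    def safe(bounds: Tuple[int, int]) -> str:
        start, end = bounds
        try:
            return transcribe_chunk(start, end)
        except Exception as exc:  # keep going if a single chunk fails
            print(f"Chunk {start}-{end} ms failed: {exc}")
            return ""

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of finish order.
        return list(pool.map(safe, ranges))


if __name__ == "__main__":
    # Stand-in for a real transcription call; fails on the second chunk.
    def fake_transcribe(start_ms: int, end_ms: int) -> str:
        if start_ms == 60_000:
            raise RuntimeError("simulated failure")
        return f"[{start_ms}-{end_ms}]"

    print(transcribe_all(chunk_ranges(150_000), fake_transcribe))
```

Returning results in input order keeps the final text in sequence even when chunks finish at different times, and the empty-string fallback mirrors the "continue even if individual chunks fail" behavior listed under Features.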
## Limitations
- Currently only supports MP3 input files
- Transcription accuracy depends on the Whisper model used (default is "base")
## Contributing
Contributions to improve the script are welcome. Please feel free to submit a Pull Request.
## License
This project is open source and available under the [MIT License](LICENSE).
## Acknowledgments
- This script uses OpenAI's Whisper model for transcription
- Audio processing is handled by the pydub library