https://github.com/bluebirdback/deepgram_caption_generator

Use Deepgram's API to transcribe audio to text & generate captions in WebVTT or SubRip format.

Host: GitHub
URL: https://github.com/bluebirdback/deepgram_caption_generator
Owner: BlueBirdBack
License: mit
Created: 2024-03-22T07:33:34.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-03-22T07:45:37.000Z (about 2 years ago)
Last Synced: 2025-01-13T14:52:33.436Z (over 1 year ago)
Language: Python
Size: 4.88 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Deepgram Caption Generator

This Python script uses Deepgram's speech recognition API to transcribe audio files and generate captions in WebVTT (`.vtt`) or SubRip (`.srt`) format. It supports multiple Deepgram transcription models, so you can pick the one that gives the best accuracy for your audio type and use case. It is aimed at content creators, podcasters, and anyone who wants to add captions to their audio content.

## Features

- Transcribe audio files using Deepgram's speech-to-text API.
- Generate captions in WebVTT (`.vtt`) or SubRip (`.srt`) format, or output a plain-text transcript.
- Choose from a variety of Deepgram transcription models.

## Prerequisites

- Python 3.10 or later
- A Deepgram account and API key. Sign up [here](https://console.deepgram.com/signup) to obtain your API key.

## Installation

1. Clone this repository or download the files to your local machine.
2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

3. Create a `.env` file in the project root and add your Deepgram API key like so:

```
DEEPGRAM_API_KEY='your_deepgram_api_key_here'
```
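
For readers curious how a `.env` entry like the one above ends up available to a script, here is a minimal sketch using only the standard library. The README does not say how `transcribe.py` loads the file (many projects use the `python-dotenv` package for this), so `load_env_file` is a hypothetical helper for illustration, not the project's own code.

```python
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines from a .env file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        # No file: return an empty mapping rather than raising.
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, as in the example above.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

Calling `load_env_file().get("DEEPGRAM_API_KEY")` from the project root would then return the key without its quotes, or `None` if the file or the entry is missing.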

## Usage

Navigate to the `src/` directory and run the script from the command line, passing the audio file path and, optionally, the transcription model (`-m`) and the output format (`-f`).

```bash
python transcribe.py path_to_your_audio_file -m model_name -f output_format
```

- `path_to_your_audio_file`: The path to the audio file you want to transcribe.
- `-m model_name` (optional): The Deepgram model to use for transcription. Defaults to `nova-2`.
- `-f output_format` (optional): The format of the transcription output. Must be `vtt`, `srt`, or `text`. Defaults to `vtt`.
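
The arguments above map naturally onto Python's `argparse`. The sketch below shows one way such a command line could be declared. It is an assumption about the interface, not a copy of `transcribe.py`: the long option names `--model` and `--format`, the attribute name `output_format`, and the function `build_parser` are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI matching the documented arguments (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Transcribe an audio file and write captions."
    )
    # Required positional argument: the audio file to transcribe.
    parser.add_argument("audio_path", help="path to the audio file")
    # Optional model name; the README documents nova-2 as the default.
    parser.add_argument("-m", "--model", default="nova-2",
                        help="Deepgram model to use")
    # Optional output format, restricted to the three documented values.
    parser.add_argument("-f", "--format", dest="output_format",
                        choices=["vtt", "srt", "text"], default="vtt",
                        help="output format")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.audio_path, args.model, args.output_format)
```

Restricting `-f` with `choices` means an unsupported format is rejected before any request is sent.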

Example, run from the project root rather than from `src/`:

```bash
python src/transcribe.py ./audio/sample.mp3
```

This command transcribes `sample.mp3` with the default `nova-2` model and saves the captions in WebVTT format.
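
To make the difference between the two caption formats concrete, here is a minimal sketch that renders timed cues as WebVTT and as SubRip. It does not call Deepgram and is not the script's own implementation: the cue layout `(start_seconds, end_seconds, text)` and the function names are assumptions chosen for illustration. The format details it relies on are standard: WebVTT files start with a `WEBVTT` header and use `.` before the milliseconds, while SubRip numbers each cue and uses `,`.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm, with '.' for VTT or ',' for SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_vtt(cues: list[tuple[float, float, str]]) -> str:
    """Build a WebVTT document: header line, then one block per cue."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}")
        lines.append(text)
        lines.append("")  # blank line separates cues
    return "\n".join(lines)


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build a SubRip document: numbered cues, comma before milliseconds."""
    lines = []
    for index, (start, end, text) in enumerate(cues, start=1):
        lines.append(str(index))
        lines.append(f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

For example, `to_vtt([(0.0, 1.25, "Hello")])` produces a `WEBVTT` header followed by the cue line `00:00:00.000 --> 00:00:01.250`, and `to_srt` on the same input produces `1` followed by `00:00:00,000 --> 00:00:01,250`.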

## Supported Audio Formats

Deepgram supports a wide range of audio formats. For optimal results, use audio files in MP3, WAV, or FLAC format.

## Note

Ensure your audio file is accessible and the path is correct. Network issues or incorrect API keys may lead to transcription failures.
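
The two local causes mentioned above, a wrong path and a missing key, can be checked before any request is made. The following is a minimal sketch of such a check. `preflight_problems` is a hypothetical helper, not part of this project, and it cannot detect network problems or a key that is set but invalid; those only surface when the API is actually called.

```python
import os
from collections.abc import Mapping
from pathlib import Path
from typing import Optional


def preflight_problems(audio_path: str,
                       env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a list of local problems found before transcribing (hypothetical)."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    problems: list[str] = []
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {audio_path}")
    if not env.get("DEEPGRAM_API_KEY", "").strip():
        problems.append("DEEPGRAM_API_KEY is not set")
    return problems
```

An empty list means both local checks passed.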
