https://github.com/bluebirdback/vtt-to-text
A Python script to convert WebVTT subtitle files (.vtt) to plain text transcripts. Supports single file and batch directory processing.
- Host: GitHub
- URL: https://github.com/bluebirdback/vtt-to-text
- Owner: BlueBirdBack
- License: MIT
- Created: 2024-12-17T09:29:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-05T11:48:59.000Z (over 1 year ago)
- Last Synced: 2025-05-20T18:58:24.481Z (about 1 year ago)
- Language: Python
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# vtt-to-text
A Python script to convert WebVTT subtitle files (.vtt) to plain text transcripts. Supports single file and batch directory processing.
## Features
- Convert single VTT files to plain text transcripts
- Batch process entire directories of VTT files
- Strip HTML tags and timestamps automatically
- Remove consecutive duplicate lines
- UTF-8 encoding support
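The cleanup steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from `vtt_to_text.py`: the function name `vtt_to_text` and both regular expressions are assumptions, and the sketch does not handle cue identifiers or `NOTE`/`STYLE` blocks.

```python
import re

# Hypothetical patterns; the repository's script may use different ones.
# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.000".
TIMESTAMP_RE = re.compile(r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+")
# Matches inline markup such as <c>, </c>, <b>, or <00:00:01.500>.
TAG_RE = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Return the caption text of a WebVTT document, one line per caption line."""
    lines = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blank lines, the "WEBVTT" header, and cue timing lines.
        if not line or line.startswith("WEBVTT") or TIMESTAMP_RE.match(line):
            continue
        # Strip inline tags, then drop the line if it repeats the previous one.
        text = TAG_RE.sub("", line).strip()
        if text and (not lines or lines[-1] != text):
            lines.append(text)
    return "\n".join(lines)
```

Only consecutive repeats are dropped, so a phrase that recurs later in the transcript is kept.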
## Usage
### Single File Conversion
```bash
python vtt_to_text.py input.vtt
```
This creates a text file with the same base name as the input file and a `.txt` extension (e.g., `input.txt`).
### Batch Directory Processing
```bash
python vtt_to_text.py -i /path/to/input/directory -o /path/to/output/directory
```
This converts every `.vtt` file in the input directory and saves the transcripts to the output directory.
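Batch mode could be structured along these lines. The sketch is an assumption about the general shape, not the script's actual code: `convert_directory` and its `convert` parameter are hypothetical names, and whether the real script creates a missing output directory is not stated in this README.

```python
from pathlib import Path
from typing import Callable


def convert_directory(input_dir: str, output_dir: str,
                      convert: Callable[[str], str]) -> list[Path]:
    """Convert every .vtt file in input_dir and write .txt files to output_dir."""
    src = Path(input_dir)
    dst = Path(output_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"Input directory not found: {src}")
    # Assumption: create the output directory if it does not exist yet.
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for vtt_path in sorted(src.glob("*.vtt")):
        # subtitles/talk.vtt -> transcripts/talk.txt
        out_path = dst / vtt_path.with_suffix(".txt").name
        text = convert(vtt_path.read_text(encoding="utf-8"))
        out_path.write_text(text, encoding="utf-8")
        written.append(out_path)
    return written
```

Passing the conversion step in as a callable keeps the directory walk independent of how a single file is cleaned up.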
## Command Line Arguments
- `input`: Path to a single input `.vtt` file (optional if `-i` is used)
- `-i, --input-path`: Path to the input directory containing multiple `.vtt` files
- `-o, --output-path`: Path to the output directory for transcript files (required when using `-i`)
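For readers who want to see how these arguments fit together, here is one way to declare them with the standard `argparse` module. The argument names come from the list above; the helper names `build_parser` and `parse_args`, the help strings, and the validation messages are assumptions rather than the script's own code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Convert WebVTT subtitle files to plain text transcripts."
    )
    # Positional argument, optional so that batch mode can omit it.
    parser.add_argument("input", nargs="?",
                        help="path to a single input .vtt file")
    parser.add_argument("-i", "--input-path",
                        help="input directory containing .vtt files")
    parser.add_argument("-o", "--output-path",
                        help="output directory for transcript files")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Enforce the documented rule: -o is required whenever -i is given.
    if args.input_path and not args.output_path:
        parser.error("-o/--output-path is required when using -i/--input-path")
    # Assumed rule: at least one input source must be provided.
    if not args.input and not args.input_path:
        parser.error("provide an input file or -i/--input-path")
    return args
```

`parser.error` prints a usage message and exits with status 2, which is the conventional behavior for invalid command-line input.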
## Example
```bash
# Convert a single file
python vtt_to_text.py subtitles.vtt
# Convert all VTT files in a directory
python vtt_to_text.py -i ./subtitles -o ./transcripts
```
## Error Handling
The script includes error handling for:
- File not found
- Directory not found
- Read/write permission issues
- Invalid file formats
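The README does not say how each case is detected or reported, so the following is only a sketch of how such handling is commonly written in Python. The functions `read_vtt` and `main`, the extension check used for "invalid file formats", and the error messages are all assumptions.

```python
import sys
from pathlib import Path


def read_vtt(path: str) -> str:
    """Read a .vtt file as UTF-8, rejecting other extensions."""
    p = Path(path)
    # Assumption: "invalid file format" means a non-.vtt extension.
    if p.suffix.lower() != ".vtt":
        raise ValueError(f"not a .vtt file: {p}")
    return p.read_text(encoding="utf-8")


def main(path: str) -> int:
    """Return 0 on success, 1 on any handled error."""
    try:
        text = read_vtt(path)
    except FileNotFoundError:
        print(f"Error: file not found: {path}", file=sys.stderr)
        return 1
    except PermissionError:
        print(f"Error: permission denied: {path}", file=sys.stderr)
        return 1
    except (ValueError, UnicodeDecodeError) as exc:
        print(f"Error: invalid input: {exc}", file=sys.stderr)
        return 1
    print(f"Read {len(text)} characters from {path}")
    return 0
```

Returning an exit code instead of calling `sys.exit` inside the function keeps it easy to call from tests or from a batch loop.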