https://github.com/heaversm/srt-to-json-converter
- Host: GitHub
- URL: https://github.com/heaversm/srt-to-json-converter
- Owner: heaversm
- Created: 2024-01-12T21:43:52.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-10-21T20:26:10.000Z (2 months ago)
- Last Synced: 2024-12-16T01:39:05.648Z (7 days ago)
- Language: Python
- Size: 521 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
---
title: Transcript Converter (SRT to JSON)
emoji: 🏃
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 4.12.0
app_file: app.py
pinned: false
license: apache-2.0
---

# Transcript to JSON Converter
![Transcript to JSON Conversion](labelstudio-readme-graphic.jpg)
### What is it?
This app turns .srt (SubRip Text) transcript files into JSON. [SRT](https://en.wikipedia.org/wiki/SubRip) is one of the output formats supported by [Whisper](https://github.com/openai/whisper), OpenAI's open-source speech-to-text library.
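
To make the conversion concrete, here is a minimal sketch of what turning SRT cues into JSON can look like. It is not the code in `app.py`: the `parse_srt` helper and the sample text are hypothetical, and it assumes well-formed SRT (an index line, a `start --> end` timing line, then one or more text lines, with blank lines between cues).

```python
import json
import re

# Hypothetical sample; real SRT files use the same index / timing / text layout.
SAMPLE_SRT = """1
00:00:00,000 --> 00:00:02,500
Welcome to the show.

2
00:00:02,500 --> 00:00:05,000
Today we talk about
subtitles.
"""


def parse_srt(text):
    """Parse well-formed SRT text into a list of dicts (illustrative helper)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip blocks missing an index, timing, or text line
        start, end = [part.strip() for part in lines[1].split("-->")]
        cues.append({
            "line_id": int(lines[0]),
            "timestamp_start": start,
            "timestamp_end": end,
            # Multi-line cue text is joined into a single string.
            "content": " ".join(lines[2:]),
        })
    return cues


cues = parse_srt(SAMPLE_SRT)
print(json.dumps(cues, indent=2))
```

Each cue becomes one JSON object, which is the shape that is convenient to load in a web app or an annotation tool.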
### Why is it useful?
* Allows you to easily import transcripts into web apps
* Allows you to convert your transcript to a data format suitable for ML model training and fine tuning.

### Try it
[Use the app on Hugging Face Spaces](https://huggingface.co/spaces/mikemoz/srt_to_json_converter)
### Modify It
Your JSON structure will no doubt differ from the one I used. To get the output into your desired JSON format:
* Clone this repo
* Do what you want with the app, but if all you need is to modify the JSON output, focus on this line:

```python
transcripts.append({'podcast_name': podcast_name, 'podcast_episode': podcast_episode, 'line_id': id, 'timestamp_start': timestamp_start, 'timestamp_end': timestamp_end, 'content': transcript})
```

* Change that JSON structure 🔼 so that the object keys are whatever you need them to be.
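
As an illustration of that kind of change, the sketch below builds a record with a different set of keys. The `to_record` helper, the key names (`id`, `start`, `end`, `text`), and the sample values are all hypothetical; in `app.py` you would more likely edit the dictionary literal directly.

```python
import json


def to_record(line_id, timestamp_start, timestamp_end, transcript):
    """Build one output record with a custom key layout (hypothetical keys)."""
    return {
        "id": line_id,
        "start": timestamp_start,
        "end": timestamp_end,
        "text": transcript,
    }


transcripts = []
# Sample values only; the app fills these in from the parsed SRT file.
transcripts.append(
    to_record(1, "00:00:00,000", "00:00:02,500", "Welcome to the show.")
)

output = json.dumps(transcripts, indent=2)
print(output)
```

Keeping the key layout in one place like this makes it easier to switch formats later, but that is a design preference rather than something the app requires.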
### Run it Locally
```bash
python app.py
```

### Learn how to use it in ML model training
[Watch this video](https://youtu.be/5HWeVZXQ-E4), where I discuss why you might want to fine-tune a model (vs. using RAG), the steps in model training, how to prepare data for annotation, and how to use Label Studio (open source) for that annotation.