https://github.com/llimllib/yt-transcribe
Transcribe a youtube video into an easily readable HTML file
- Host: GitHub
- URL: https://github.com/llimllib/yt-transcribe
- Owner: llimllib
- License: unlicense
- Created: 2024-09-07T00:31:03.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-09-22T18:58:01.000Z (4 months ago)
- Last Synced: 2024-11-30T02:50:57.290Z (about 2 months ago)
- Language: Shell
- Homepage:
- Size: 421 KB
- Stars: 12
- Watchers: 2
- Forks: 1
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
# yt-transcribe
**Transcribe a youtube video into an easily readable HTML file**
I've put up a [sample transcription with thumbnails](https://llimllib.github.io/yt-transcribe/definedefine/definedefine.html) and a [sample without thumbnails](https://llimllib.github.io/yt-transcribe/cumberbatch/) if you want to see what the output looks like.
## installation
Right now, I think this only works on a mac. To install all the dependencies, run:
`brew install ffmpeg jq python yt-dlp && pip install mlx_whisper`
Once you have the dependencies installed, copy `yt-transcribe` anywhere on your path. I recommend `/usr/local/bin`. Then make sure it's executable with something like `chmod a+x /usr/local/bin/yt-transcribe`.
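If something doesn't work after installing, a quick way to see which dependency is missing is a check along these lines. This is only a sketch: `missing_deps` is a hypothetical helper, not part of `yt-transcribe`, and the executable names it checks are assumed to match the package names above (on some systems `python` is only available as `python3`, for example).

```shell
#!/usr/bin/env bash
# missing_deps is a hypothetical helper, not part of yt-transcribe.
# It prints the name of every listed command that cannot be found on PATH.
missing_deps() {
  local cmd
  for cmd in "$@"; do
    # command -v exits non-zero when the command cannot be resolved
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Assumption: each dependency is installed under an executable of the same name.
missing=$(missing_deps ffmpeg jq mlx_whisper python yt-dlp)
if [ -n "$missing" ]; then
  echo "missing dependencies:"
  echo "$missing"
else
  echo "all dependencies found"
fi
```

If a name is reported as missing but you know the tool is installed, it is probably on your system under a different executable name; pass that name to `missing_deps` instead.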
If you would like to use this on a non-mac computer, [let me know](https://hachyderm.io/@llimllib/); it could be made to work with [whisper.cpp](https://github.com/ggerganov/whisper.cpp) fairly easily.
**Please report any issues you find!**
## usage
```
Usage: yt-transcribe [options]

Transcribe a youtube video into an easily readable HTML file
OPTIONS
-help: print this message
-outdir: the directory to put the output files in. [default /tmp/yttranscribe_cache]
-outfile: the name of the output HTML file
-thumbs: enable thumbnail generation
-thumbinterval: the interval between thumbnails, in seconds [default 30]
-v: print more verbose output

DEPENDENCIES
Assumes you have installed:
- ffmpeg
- jq
- mlx_whisper
- python
- yt-dlp

To install all on a mac:
`brew install ffmpeg jq python yt-dlp && pip install mlx_whisper`
EXAMPLES
Transcribe a youtube video:
yt-transcribe 'https://www.youtube.com/watch?v=vP4iY1TtS3s'
Transcribe a video and insert thumbnails every 30 seconds (the default):
yt-transcribe -thumbs 'https://www.youtube.com/watch?v=Ac7G7xOG2Ag'
Transcribe a video and insert thumbnails every 10 seconds:
yt-transcribe -thumbs -thumbinterval 10 'https://www.youtube.com/watch?v=X48G7Y0VWW4'
Transcribe a video to the 'look-around-you' directory, with a filename 'water.html':
yt-transcribe -thumbs -outdir ./look-around-you -outfile water.html 'https://www.youtube.com/watch?v=gaI6kBVyu00'
source: https://github.com/llimllib/yt-transcribe
```
## why mlx_whisper instead of whisper.cpp?
Because it's [a lot faster on my machine](https://notes.billmill.org/link_blog/2024/08/mlx-whisper.html)