https://github.com/echohello-dev/transcribe-yt
Tool used to capture video transcriptions from yt
Last synced: 12 months ago
- Host: GitHub
- URL: https://github.com/echohello-dev/transcribe-yt
- Owner: echohello-dev
- Created: 2024-09-15T01:31:34.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-15T05:52:15.000Z (over 1 year ago)
- Last Synced: 2024-09-16T08:29:34.673Z (over 1 year ago)
- Language: TypeScript
- Size: 54.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Transcribe-YT
Transcribe-YT is a Node.js application that downloads YouTube videos, extracts their audio, and transcribes them using OpenAI's Whisper API or AssemblyAI.
## Prerequisites
- Node.js (v18.19.1 or later recommended)
- Yarn v4
- FFmpeg installed on your system
- An OpenAI API key
- An AssemblyAI API key
## Installation
1. Clone this repository:
```
git clone https://github.com/echohello-dev/transcribe-yt.git
cd transcribe-yt
```
2. Install dependencies using Yarn v4:
```
yarn install
```
3. Copy the example configuration file and edit it with your YouTube video URLs:
```
cp config.example.yaml config.yaml
```
Then edit `config.yaml` to list the YouTube video URLs you want to transcribe and to select the transcription service.
4. Create a `.env` file in the root directory and add your API keys:
```
OPENAI_API_KEY=your_openai_api_key_here
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
```
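Since a missing key would otherwise only surface once a transcription request is made, it can help to check the environment up front. The following is a minimal sketch, not code from this repository: the service labels `"openai"` and `"assemblyai"` and the helper `missingKey` are hypothetical, and it assumes only the key for the selected service is required. The variable names are the ones shown in the `.env` example above.

```typescript
// Service labels are hypothetical; the variable names come from the .env example.
type Service = "openai" | "assemblyai";

const KEY_FOR_SERVICE: Record<Service, string> = {
  openai: "OPENAI_API_KEY",
  assemblyai: "ASSEMBLYAI_API_KEY",
};

// Returns the name of the missing variable, or null when the key is set.
function missingKey(
  service: Service,
  env: Record<string, string | undefined>,
): string | null {
  const name = KEY_FOR_SERVICE[service];
  const value = env[name];
  // Treat an empty or whitespace-only value as missing.
  return value !== undefined && value.trim() !== "" ? null : name;
}
```

In a Node.js script this could be called as `missingKey("openai", process.env)` once the `.env` file has been loaded, exiting with a clear message if it returns a name.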
## Usage
1. Ensure your `config.yaml` file contains the YouTube video URLs you want to transcribe and specifies the transcription service to use.
2. Run the application:
```
yarn start
```
The script will:
- Download the audio from each YouTube video
- Convert the audio to MP3 format
- Transcribe the audio using either OpenAI's Whisper API or AssemblyAI (as specified in config.yaml)
- Save the transcriptions in the `transcripts` directory
- Clean up the temporary audio files
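The steps above can be sketched as a small pipeline. This is an illustration only and not the contents of `processVideos.ts`: the `PipelineSteps` interface and its method names are hypothetical, and each step is injected so the flow can be read (and tested) without network access. Running cleanup in a `finally` block, so that temporary audio is removed even when a step fails, is a design choice of this sketch rather than documented behavior.

```typescript
// Hypothetical step signatures; the real script may be organized differently.
interface PipelineSteps {
  downloadAudio(url: string): Promise<string>;       // resolves to the raw audio path
  convertToMp3(inputPath: string): Promise<string>;  // resolves to the MP3 path
  transcribe(mp3Path: string): Promise<string>;      // resolves to the transcript text
  saveTranscript(url: string, text: string): Promise<void>;
  cleanup(paths: string[]): Promise<void>;
}

async function processVideo(url: string, steps: PipelineSteps): Promise<void> {
  const tempFiles: string[] = [];
  try {
    const rawPath = await steps.downloadAudio(url);
    tempFiles.push(rawPath);
    const mp3Path = await steps.convertToMp3(rawPath);
    tempFiles.push(mp3Path);
    const text = await steps.transcribe(mp3Path);
    await steps.saveTranscript(url, text);
  } finally {
    // Remove whatever temporary audio was created, even if a step threw.
    await steps.cleanup(tempFiles);
  }
}

// Processes the videos one at a time, in the order given.
async function processVideos(urls: string[], steps: PipelineSteps): Promise<void> {
  for (const url of urls) {
    await processVideo(url, steps);
  }
}
```

Processing sequentially keeps the number of temporary files small; whether the actual script runs videos one at a time or in parallel is not stated here.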
## Configuration
- `config.yaml`: List the YouTube video URLs you want to transcribe and specify the transcription service to use.
- `.env`: Store your OpenAI and AssemblyAI API keys.
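As an illustration of what validating `config.yaml` might look like after parsing, here is a hedged sketch. The key names `videos` and `service`, the service labels, and the `parseConfig` helper are all assumptions made for this example; `config.example.yaml` in the repository is the authority on the real schema. The function takes an already-parsed value (for instance, the result of `js-yaml`'s `load`), so it has no dependencies of its own.

```typescript
// Hypothetical config shape; check config.example.yaml for the real keys.
type Service = "openai" | "assemblyai";

interface AppConfig {
  videos: string[];
  service: Service;
}

function isService(value: unknown): value is Service {
  return value === "openai" || value === "assemblyai";
}

// Validates an already-parsed YAML document and returns a typed config.
function parseConfig(raw: unknown): AppConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("config.yaml must contain a mapping at the top level");
  }
  const { videos, service } = raw as Record<string, unknown>;
  if (!Array.isArray(videos) || videos.length === 0) {
    throw new Error("`videos` must be a non-empty list of URLs");
  }
  if (!videos.every((v) => typeof v === "string" && v.startsWith("https://"))) {
    throw new Error("every entry in `videos` must be an https URL string");
  }
  if (!isService(service)) {
    throw new Error("`service` must be \"openai\" or \"assemblyai\"");
  }
  return { videos: videos as string[], service };
}
```

Failing fast on a malformed config avoids downloading audio before discovering that the transcription service is misspelled.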
## Project Structure
- `processVideos.ts`: Main script that handles video processing and transcription.
- `package.json`: Defines project dependencies and scripts.
- `config.yaml`: Contains the list of YouTube video URLs to process and the transcription service to use.
- `transcripts/`: Directory where transcriptions are saved.
- `audio/`: Temporary directory for audio files (cleaned up after processing).
## Dependencies
Key dependencies include:
- `@distube/ytdl-core`: For downloading YouTube videos
- `fluent-ffmpeg`: For audio processing
- `openai`: For interacting with the OpenAI API
- `assemblyai`: For interacting with the AssemblyAI API
- `js-yaml`: For parsing the YAML configuration file
- `tsx`: For running TypeScript files directly
For a full list of dependencies, refer to `package.json`.
## License
This project is marked `UNLICENSED`, which by npm convention means no license is granted for reuse.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Disclaimer
This tool is for educational and personal use only. Ensure you have the right to download and transcribe the YouTube content you're processing.