https://github.com/heyfoz/python-youtube-transcription
This repository contains Python scripts and a local Flask web application for transcribing YouTube videos using various methods. It includes functionalities to retrieve video transcripts using the YouTube Data API, download audio from YouTube videos, and convert audio to text using speech recognition.
accessibility api audio captions python speech-recognition speech-to-text subtitles webvtt youtube youtube-audio youtube-audio-downloader
- Host: GitHub
- URL: https://github.com/heyfoz/python-youtube-transcription
- Owner: heyfoz
- License: mit
- Created: 2024-05-24T03:48:20.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-29T02:45:19.000Z (about 1 year ago)
- Last Synced: 2025-04-06T02:25:19.460Z (6 months ago)
- Topics: accessibility, api, audio, captions, python, speech-recognition, speech-to-text, subtitles, webvtt, youtube, youtube-audio, youtube-audio-downloader
- Language: Python
- Size: 31.3 KB
- Stars: 4
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Python YouTube Transcription
This repository contains Python scripts and a local Flask web application for transcribing YouTube videos. It can retrieve caption transcripts through the YouTube Data API, download audio from YouTube videos, and convert that audio to text using speech recognition.
## Features
- **get_youtube_captions.py**: Contains functions to retrieve YouTube video caption transcripts using the YouTube Data API.
- **youtube_speech_recognition.py**: Downloads audio from YouTube videos and transcribes it using speech recognition.
- **download_youtube_audio.py**: Script to download YouTube audio and save it with a timestamp-based filename.
- **index.html**: HTML template for the Flask web application UI used to transcribe YouTube videos.
## Installation
1. Clone the repository:
```bash
git clone https://github.com/heyfoz/python-youtube-transcription.git
```
2. Install the required libraries:
```bash
pip install Flask google-api-python-client pytube pydub SpeechRecognition
```
## Usage
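Step 1 below mentions a Google API key held in an environment variable. As a rough sketch of what that setup might look like in code (not the repository's actual implementation), the example reads the key and lists caption tracks for a video. The variable name `YOUTUBE_API_KEY` and the helpers `extract_video_id`, `get_api_key`, and `list_caption_tracks` are assumptions made for illustration; check `get_youtube_captions.py` for the names the project really uses.

```python
import os
from urllib.parse import urlparse, parse_qs

# Hostnames accepted for standard watch URLs (illustrative, not exhaustive).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch or youtu.be URL (hypothetical helper)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in WATCH_HOSTS:
        # Watch links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in: {url}")
    return video_id


def get_api_key(env_var: str = "YOUTUBE_API_KEY") -> str:
    """Read the API key from the environment; the variable name is an assumption."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def list_caption_tracks(video_url: str) -> list:
    """List caption track metadata for a video via the YouTube Data API v3."""
    # Imported lazily so the helpers above work without the client installed.
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=get_api_key())
    # Some caption endpoints may require OAuth rather than an API key;
    # consult the API documentation for the authorization rules.
    response = youtube.captions().list(
        part="snippet", videoId=extract_video_id(video_url)
    ).execute()
    return response.get("items", [])
```

Calling `list_caption_tracks("https://www.youtube.com/watch?v=<video id>")` needs network access and a valid key; the two helper functions run offline.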
1. Ensure you have set up your Google API key and environment variable as specified in `get_youtube_captions.py`.
2. Run the desired Flask application (`get_youtube_captions.py` or `youtube_speech_recognition.py`).
3. Open your web browser and navigate to `http://localhost:5000` to access the web application.
4. Enter a YouTube video URL and choose the desired transcription option.
5. Optionally, use `download_youtube_audio.py` to download a .wav file to the project directory.
## Documentation
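The audio and speech-recognition steps described under Usage can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the `audio_<timestamp>.wav` naming scheme and the helpers `timestamped_wav_path` and `transcribe_wav` are hypothetical, and the transcription uses the SpeechRecognition library's `recognize_google` method, which needs network access.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_wav_path(directory: str = ".", now: Optional[datetime] = None) -> Path:
    """Build a timestamp-based .wav path (the naming scheme is an assumption)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"audio_{stamp}.wav"


def transcribe_wav(wav_path: str) -> str:
    """Transcribe a .wav file with the SpeechRecognition library."""
    # Imported lazily so the filename helper works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        # Sends the audio to Google's web speech service for recognition.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # Raised when no intelligible speech was detected.
        return ""
```

A downloaded file saved at `timestamped_wav_path()` could then be passed to `transcribe_wav`; long recordings may need to be split into shorter chunks before recognition.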
- [YouTube Data API Documentation](https://developers.google.com/youtube/v3/docs)
- [pytube Documentation](https://pytube.io/en/latest/)
- [pydub Documentation](https://github.com/jiaaro/pydub#documentation)
- [SpeechRecognition Documentation](https://github.com/Uberi/speech_recognition#readme)
## Note
- The `index.html` file can be used with any of the Flask scripts mentioned above.
- YouTube limits audio availability for some videos because of intellectual property concerns and user configurations, so some videos may not be accessible for transcription.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.