Video transcription service built as a part of Practice School - I course at William O'Neil, Bangalore
https://github.com/thedhruvrawat/y-dat

![banner](https://user-images.githubusercontent.com/59053357/126071167-3c453ea8-a810-4919-a448-41def9b50082.png)

# Y-DAT: Youtube Video Download and Transcription

Lets the user download a YouTube video from its URL (using the `youtube_dl` library) and then generate its transcript using the `SpeechRecognition` library.

### Installation
Make sure you have Python 3.5 or higher installed, along with `pip`.

#### Cloning the repository
```bash
git clone https://github.com/thedhruvrawat/y-dat.git
```

#### Installing the requirements
To install the requirements, run
```bash
pip install -r requirements.txt
```

### Downloading the video
To download a video from a link, run the command below, passing the video link with the `-url` flag:
```bash
python downloader.py -url
```
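The download step can be sketched roughly as follows. This is a minimal illustration of how `youtube_dl` can be driven from Python, not necessarily how `downloader.py` is written: the function names `build_options` and `download` are hypothetical, and the chosen format string and output template are assumptions.

```python
def build_options(output_name="video"):
    """Return youtube_dl options for a download named <output_name>.<ext> (hypothetical helper)."""
    return {
        # Prefer the best separate video and audio streams, else the best single file.
        "format": "bestvideo+bestaudio/best",
        # When separate streams are merged, use a Matroska container (requires ffmpeg).
        "merge_output_format": "mkv",
        # youtube_dl fills in %(ext)s with the final file extension.
        "outtmpl": output_name + ".%(ext)s",
    }


def download(url, output_name="video"):
    """Download a single video (hypothetical helper, not the repository's code)."""
    # Imported lazily so build_options() can be used without the package installed.
    import youtube_dl  # third-party: pip install youtube_dl

    with youtube_dl.YoutubeDL(build_options(output_name)) as ydl:
        ydl.download([url])
```

Calling `download(link)` with a YouTube link would then place the file next to the script. The `.mkv` extension is only guaranteed when two streams are merged; a single-file fallback keeps its own extension.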

### Generating the transcript
To recognize the speech in a video, run the command below, passing the video file with the `-video` flag:
```bash
python3 recognizer.py -video
```
By default, the downloaded video is saved as `video.mkv`.

The generated text is saved to `transcript.txt`. An `audio.wav` file containing only the audio track of the downloaded video is also created in the same folder. Only English is supported.

> Best results are obtained when the audio is free of background noise.
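The transcription step described above (extract the audio, then recognize the speech) can be sketched as below. This is an illustration under stated assumptions, not the code in `recognizer.py`: it assumes `ffmpeg` is on the `PATH` for audio extraction and uses the Google Web Speech backend of `SpeechRecognition`; the helper names `ffmpeg_command` and `transcribe`, the mono 16 kHz settings, and the `en-US` language code are all hypothetical choices.

```python
import subprocess


def ffmpeg_command(video_path="video.mkv", audio_path="audio.wav"):
    """Build an ffmpeg call that drops the video stream and writes mono 16 kHz WAV."""
    return ["ffmpeg", "-y", "-i", video_path,
            "-vn", "-ac", "1", "-ar", "16000", audio_path]


def transcribe(video_path="video.mkv", audio_path="audio.wav",
               text_path="transcript.txt"):
    """Extract the audio, recognize English speech, and save the text."""
    # Imported lazily so ffmpeg_command() works without the package installed.
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    # Step 1: write audio.wav next to the video (raises if ffmpeg fails).
    subprocess.run(ffmpeg_command(video_path, audio_path), check=True)

    # Step 2: load the whole WAV file into memory as audio data.
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)

    # Step 3: send the audio to the recognizer (network access required).
    try:
        text = recognizer.recognize_google(audio, language="en-US")
    except sr.UnknownValueError:
        text = ""  # the speech could not be understood

    # Step 4: save the transcript.
    with open(text_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

Sending the whole file in one request keeps the sketch short; long recordings may need to be split into shorter chunks before recognition.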

### Further Reading
* [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* [SpeechRecognition](https://github.com/Uberi/speech_recognition)
* [Librosa](https://github.com/librosa/librosa)
* [Russian-Subtitles-Generator](https://github.com/nestyme/Subtitles-generator)

### License
MIT