https://github.com/abinashmeher999/voice-data-extract

A command line interface to combine text information from subtitles with voice data in the video. Provides a convenient way to generate training data for speech-recognition purposes.

Host: GitHub
URL: https://github.com/abinashmeher999/voice-data-extract
Owner: abinashmeher999
License: mit
Created: 2017-07-19T17:40:50.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2023-10-04T00:58:11.000Z (about 2 years ago)
Last Synced: 2024-12-27T22:54:43.175Z (10 months ago)
Topics: speech-recognition, speech-to-text, training-data
Language: Python
Size: 26.4 KB
Stars: 19
Watchers: 2
Forks: 6
Open Issues: 4
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.rst
- License: LICENSE.txt
- Authors: AUTHORS.rst


voice-data-extract
==================

[![PyPI version](https://badge.fury.io/py/srtvoiceext.svg)](https://badge.fury.io/py/srtvoiceext)

A command line interface to combine text information from subtitles with voice data in the video.

Provides a convenient way to generate training data for speech-recognition purposes.

Description
===========

The project provides a quick way to generate audio training data for speech-recognition machine learning models.

It draws on a vast bank of annotated voice data that already exists: **subtitles!**

It reads the subtitles line by line and clips the audio from the video for the corresponding time interval.
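The subtitle-reading step can be sketched as follows. This is a minimal illustration, not the project's own code: the project credits pysrt and moviepy for this work, whereas the sketch uses only the standard library, and the names `Cue` and `parse_srt` are hypothetical. It shows how each subtitle block maps to a start time, an end time, and the text spoken in that interval.

```python
import re
from typing import Iterator, NamedTuple

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


class Cue(NamedTuple):
    """One subtitle entry (hypothetical helper type)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the four timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> Iterator[Cue]:
    """Yield one Cue per subtitle block in the SRT text."""
    # Blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is the spoken text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                yield Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), text)
                break


SAMPLE = """1
00:00:01,000 --> 00:00:02,500
I know what you are.

2
00:00:03,250 --> 00:00:05,000
Say it.
Out loud.
"""

cues = list(parse_srt(SAMPLE))
for cue in cues:
    print(f"{cue.start:.2f}s to {cue.end:.2f}s: {cue.text}")
```

Each `(start, end)` pair is the interval an audio tool would then cut from the video's soundtrack.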

Example usage:

```bash

$ srt_voice -fv video.mkv -fs subtitles.srt -o output_dir

```

The tool then walks you through a series of prompts that let you decide whether to keep or discard each audio clip, like the one below:

```

I know what you are.

[y: Keep]  [n: Delete]  [r: Repeat]  [q: Quit]

Kept as 5-I_know_what_you_are-f3nKAy.mp3

------------------------------------------

```
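The keep/delete/repeat/quit flow shown above could be structured roughly like this. It is a sketch under assumptions: the function name `review`, the `ask` and `play` callables, and the handling of unrecognised keys are all hypothetical, and `play` merely stands in for whatever actually plays the clip.

```python
from typing import Callable, Iterable, List, Tuple

PROMPT = "[y: Keep]  [n: Delete]  [r: Repeat]  [q: Quit] "


def review(
    lines: Iterable[str],
    ask: Callable[[str], str] = input,
    play: Callable[[str], None] = lambda text: None,
) -> Tuple[List[str], List[str]]:
    """Return (kept, deleted) subtitle lines based on the reviewer's answers."""
    kept: List[str] = []
    deleted: List[str] = []
    for text in lines:
        play(text)  # placeholder: play the clip for this subtitle line
        while True:
            answer = ask(PROMPT).strip().lower()
            if answer == "y":
                kept.append(text)
                break
            if answer == "n":
                deleted.append(text)
                break
            if answer == "r":
                play(text)  # replay the clip, then ask again
                continue
            if answer == "q":
                return kept, deleted  # stop reviewing immediately
            # Any other key: ask again without replaying (an assumption).
    return kept, deleted
```

Passing `ask` and `play` as parameters keeps the loop testable without a terminal or an audio player.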

It creates the directory `output_dir` and nicely arranges the audio clips there.
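Judging from the sample name `5-I_know_what_you_are-f3nKAy.mp3`, each clip name appears to combine the subtitle index, the text with spaces turned into underscores, and a short random suffix. The sketch below reproduces that pattern as an illustration only: `clip_filename` is a hypothetical helper, the exact slug rules are an assumption, and the standard-library `secrets` module stands in for shortuuid, which the project credits.

```python
import re
import secrets
import string

# Characters allowed in the random suffix (an assumption).
ALPHABET = string.ascii_letters + string.digits


def clip_filename(index: int, text: str, suffix_length: int = 6) -> str:
    """Build a name such as '5-I_know_what_you_are-f3nKAy.mp3'."""
    # Keep only word characters and join the words with underscores.
    slug = "_".join(re.findall(r"\w+", text))
    # A random suffix keeps clips with identical text from colliding.
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_length))
    return f"{index}-{slug}-{suffix}.mp3"


name = clip_filename(5, "I know what you are.")
print(name)
```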

**The training text (utf-8 encoded) is kept intact as the `title` attribute of the mp3 file.**
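To turn the output directory back into (audio file, transcript) pairs, the title tag can be read from each clip. The sketch below assumes the title is stored as a standard ID3 title tag, which mutagen's "easy" interface exposes under the key `title`; the helper names `read_title` and `collect_pairs` are hypothetical, and mutagen must be installed for the default reader to work.

```python
from pathlib import Path
from typing import Callable, List, Optional, Tuple


def read_title(path: Path) -> Optional[str]:
    """Read the title tag of an audio file using mutagen."""
    import mutagen  # imported lazily so the rest works without it

    audio = mutagen.File(str(path), easy=True)
    if audio is None or audio.tags is None:
        return None  # unreadable file or no tags at all
    titles = audio.tags.get("title")
    return titles[0] if titles else None


def collect_pairs(
    directory: str,
    title_of: Callable[[Path], Optional[str]] = read_title,
) -> List[Tuple[str, str]]:
    """Return (file name, training text) for every tagged mp3 in directory."""
    pairs: List[Tuple[str, str]] = []
    for path in sorted(Path(directory).glob("*.mp3")):
        title = title_of(path)
        if title:  # skip clips without a title tag
            pairs.append((path.name, title))
    return pairs
```

Calling `collect_pairs("output_dir")` would then give a list that can be written out as a training manifest.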

For more usage options:

```bash

$ srt_voice -h

```

------

Setup
=====

You will need:

- [Audacious Music Player](http://audacious-media-player.org/download)

- [Python 3](https://launchpad.net/~fkrull/+archive/ubuntu/deadsnakes) (Optional, but recommended because of some syncing issues in moviepy)

Then:

```bash

$ pip install srtvoiceext

```

------

This has been possible only because of the hard work of the maintainers of packages like

- moviepy

- pysrt

- mutagen

- shortuuid

*This project has been set up using PyScaffold 2.5.7. For details and usage information on PyScaffold see http://pyscaffold.readthedocs.org/.*
