https://github.com/robmsmt/ASR-Audio-Data-Links

A list of publicly available audio data that anyone can download for ASR or other speech activities

# Audio Data Links

A list of common publicly (and privately) available audio data that you can download for ASR or other speech activities. All your WERs are belong to us. Inspired by [wer are we](https://github.com/syhw/wer_are_we), who stole someone else's joke.

## 1. FREE

**Source**|**Name & Direct Link**|**Type**|**Size(Hours)**
:-----:|:-----:|:-----:|:-----:
[OpenSLR](http://www.openslr.org/12)|LibriSpeech - Train:[100](http://www.openslr.org/resources/12/train-clean-100.tar.gz) [360](http://www.openslr.org/resources/12/train-clean-360.tar.gz) [500](http://www.openslr.org/resources/12/train-other-500.tar.gz)<br>Test:[Clean](http://www.openslr.org/resources/12/test-clean.tar.gz) [Other](http://www.openslr.org/resources/12/test-other.tar.gz) Dev:[Clean](http://www.openslr.org/resources/12/dev-clean.tar.gz) [Other](http://www.openslr.org/resources/12/dev-other.tar.gz)|Read|960
[OpenSLR](http://www.openslr.org/19)|[TED-LIUM Release 2](http://www.openslr.org/resources/19/TEDLIUM_release2.tar.gz)|Read|118
[OpenSLR](https://www.openslr.org/51/)|[TED-LIUM Release 3](http://www.openslr.org/resources/51/TEDLIUM_release-3.tgz)|Read|452
[Voxforge](http://www.voxforge.org/home/downloads)|[Voxforge English](https://common-voice-data-download.s3.amazonaws.com/voxforge_corpus_v1.0.0.tar.gz)|Read|130
[Mozilla](https://voice.mozilla.org)|[Common Voice v1](https://common-voice-data-download.s3.amazonaws.com/cv_corpus_v1.tar.gz)|Read|500
[Mozilla](https://voice.mozilla.org)|[Common Voice en_1087h_2019-06-12](https://voice-prod-bundler-ee1969a6ce8178826482b88e843c335139bd3fb4.s3.amazonaws.com/cv-corpus-3/en.tar.gz)|Read|1,087
[Tatoeba](http://tatoeba.org)|[Tatoeba Audio Eng](https://downloads.tatoeba.org/audio/tatoeba_audio_eng.zip)|Read|~200
[Valentini](https://datashare.is.ed.ac.uk/handle/10283/2791)|Noisy Speech Database [All Files](http://datashare.is.ed.ac.uk/download/DS_10283_2791.zip), [DOI](https://doi.org/10.7488/ds/2117) |Read|TBC
[VOiCES](https://iqtlabs.github.io/voices/)|Complex Environmental Settings [All Files](https://raw.githubusercontent.com/robmsmt/ASR-Audio-Data-Links/master/VOiCES_download.sh)|Read<br>LibriSpeech|15
[ai4bharat](https://ai4bharat.org)|[NPTEL2020](https://github.com/AI4Bharat/NPTEL2020-Indian-English-Speech-Dataset)<br>en-IN [Torrent](https://academictorrents.com/download/cc9dc56afd3055c7e0f021ec4f1824021558926c.torrent)|Lectures|15,700
[Opencollective](https://opencollective.com/open_stt)|[open_stt](https://github.com/snakers4/open_stt/)<br>Russian [Torrent](https://academictorrents.com/download/95b4cab0f99850e119114c8b6df00193ab5fa34f.torrent)|Various Read/Presented|20,108
[Speechcolab](https://arxiv.org/abs/2106.06909)|[GigaSpeech](https://github.com/SpeechColab/GigaSpeech)<br>[Link](https://github.com/SpeechColab/GigaSpeech#download)|Various Read/Presented|33,000 Unlabeled<br>10,000 Labeled
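
The direct links above can be fetched from a script. What follows is a minimal sketch, not an official downloader: it assumes the LibriSpeech archives keep the `http://www.openslr.org/resources/12/<split>.tar.gz` pattern seen in the table, and the helper names `librispeech_url` and `download_and_extract` are hypothetical ones chosen for illustration.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# Base URL and split names copied from the LibriSpeech links in the table.
LIBRISPEECH_BASE = "http://www.openslr.org/resources/12"
LIBRISPEECH_SPLITS = (
    "train-clean-100", "train-clean-360", "train-other-500",
    "dev-clean", "dev-other", "test-clean", "test-other",
)


def librispeech_url(split: str) -> str:
    """Return the archive URL for one LibriSpeech split (hypothetical helper)."""
    if split not in LIBRISPEECH_SPLITS:
        raise ValueError(f"unknown LibriSpeech split: {split!r}")
    return f"{LIBRISPEECH_BASE}/{split}.tar.gz"


def download_and_extract(url: str, dest: Path) -> Path:
    """Download a tar archive into `dest` and unpack it there.

    The download is skipped if the archive file already exists.
    Returns the path of the archive file.
    """
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        # Stream to a temporary ".part" file so an interrupted download
        # is never mistaken for a complete archive on the next run.
        partial = archive.with_name(archive.name + ".part")
        with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
            shutil.copyfileobj(response, out)
        partial.replace(archive)
    # Only extract archives from sources you trust.
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return archive


# Example call (performs a real download, so it is left commented out):
# download_and_extract(librispeech_url("test-clean"), Path("data"))
```

The same `download_and_extract` helper should work for the other `.tar.gz` and `.tgz` links in the table, since `tarfile.open` detects the compression automatically. The `.zip` and torrent entries need different tooling.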

## 2. PAID

**Source**|**Name**|**Type**|**Size(Hours)**|**Code**
:-----:|:-----:|:-----:|:-----:|:-----:
[LDC](https://www.ldc.upenn.edu)|Fisher|Conversational|2000|Speech [LDC2004S13](https://catalog.ldc.upenn.edu/LDC2004S13) [LDC2005S13](https://catalog.ldc.upenn.edu/LDC2005S13)<br>Transcripts [LDC2004T19](https://catalog.ldc.upenn.edu/LDC2004T19) [LDC2005T19](https://catalog.ldc.upenn.edu/LDC2005T19)
[LDC](https://www.ldc.upenn.edu)|Switchboard Hub5'00|Conversational|240|[LDC2002S09](https://catalog.ldc.upenn.edu/LDC2002S09)
[LDC](https://www.ldc.upenn.edu)|Switchboard Release 2|Conversational|300|[LDC97S62](https://catalog.ldc.upenn.edu/LDC97S62)
[LDC](https://www.ldc.upenn.edu)|TIMIT|Read|5|[LDC93S1](https://catalog.ldc.upenn.edu/LDC93S1)
[LDC](https://www.ldc.upenn.edu)|Wall Street Journal (WSJ)|Read|80|[LDC93S6A](https://catalog.ldc.upenn.edu/LDC93S6A) or [LDC93S6B](https://catalog.ldc.upenn.edu/LDC93S6B)

# TTS

## 1. FREE

**Source**|**Name & Direct Link**|**Type**|**Size(Hours)**
:-----:|:-----:|:-----:|:-----:
[Edinburgh CSTR](https://datashare.is.ed.ac.uk/handle/10283/2651)|[CSTR VCTK Corpus](https://datashare.is.ed.ac.uk/bitstream/handle/10283/2651/VCTK-Corpus.zip?sequence=2&isAllowed=y)|Read|44
[LJ Speech](https://keithito.com/LJ-Speech-Dataset/)|[LJ Speech](http://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2)|Read|24