Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cyrta/voxceleb
mirror of VoxCeleb dataset - a large-scale speaker identification dataset
corpus dataset speaker speaker-identification speaker-recognition speaker-verification speech
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/cyrta/voxceleb
- Owner: cyrta
- License: other
- Created: 2018-01-16T12:42:58.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-07-05T01:52:14.000Z (over 5 years ago)
- Last Synced: 2024-08-03T19:09:33.177Z (4 months ago)
- Topics: corpus, dataset, speaker, speaker-identification, speaker-recognition, speaker-verification, speech
- Language: Shell
- Size: 10 MB
- Stars: 65
- Watchers: 4
- Forks: 18
- Open Issues: 5
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
README
# VoxCeleb
mirror of [VoxCeleb dataset - a large-scale speaker identification dataset](http://www.robots.ox.ac.uk/~vgg/data/voxceleb/)

THIS IS WORK IN PROGRESS.
I would like to have a reproducible way to download mp3 audio from YouTube, trim it, and store it as delivered by the dataset authors.

----
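As a rough illustration of the download-and-trim workflow mentioned in the work-in-progress note, the sketch below only builds the command lines and does not run them. It assumes `yt-dlp` (or the flag-compatible `youtube-dl`) and `ffmpeg` are installed. The function names, file names, video ID, and timestamps are hypothetical and not part of this repository.

```python
from typing import List


def build_download_cmd(video_id: str, out_template: str = "%(id)s.%(ext)s") -> List[str]:
    """Command that extracts the audio track of one YouTube video as mp3."""
    return [
        "yt-dlp",
        "-x",                      # extract audio only
        "--audio-format", "mp3",   # convert the extracted audio to mp3
        "-o", out_template,        # output file name template
        f"https://www.youtube.com/watch?v={video_id}",
    ]


def build_trim_cmd(src: str, dst: str, start_s: float, end_s: float) -> List[str]:
    """Command that cuts the segment [start_s, end_s] out of src into dst."""
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    return [
        "ffmpeg",
        "-i", src,
        "-ss", f"{start_s:.3f}",          # segment start, in seconds
        "-t", f"{end_s - start_s:.3f}",   # segment duration, in seconds
        dst,
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes the steps easy to log and re-run, which is one way to approach reproducibility.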
This repo contains the download links to the VoxCeleb dataset, described in [1].
VoxCeleb contains over 100,000 utterances for 1,251 celebrities, extracted from videos uploaded to YouTube. The dataset is gender balanced, with 55% of the speakers male. The speakers span a wide range of different ethnicities, accents, professions and ages. There are no overlapping identities between development and test sets.
+-------------------+---------+-------+
|                   | train   | test  |
+===================+=========+=======+
| # of speakers     | 1,211   | 40    |
+-------------------+---------+-------+
| # of videos       | 21,819  | 677   |
+-------------------+---------+-------+
| # of utterances   | 139,124 | 6,255 |
+-------------------+---------+-------+

Nationality Distribution: The nationalities of the speakers in the dataset were obtained by crawling Wikipedia and can be found here. You can also view the distribution in the following graph:
.. image:: ./data/v1/country.png
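As a quick sanity check on the train/test table above, the per-split counts can be summed and compared with the figures quoted in the description. This is only an arithmetic sketch. The numbers are copied from the table and the variable names are illustrative.

```python
# Counts copied from the train/test table in this README.
split_counts = {
    "speakers":   {"train": 1211,   "test": 40},
    "videos":     {"train": 21819,  "test": 677},
    "utterances": {"train": 139124, "test": 6255},
}

# Total per row = train + test.
totals = {name: sum(parts.values()) for name, parts in split_counts.items()}

for name, total in totals.items():
    print(f"{name}: {total:,}")
```

The speaker total comes to 1,251, matching the number of celebrities stated above, and the utterance total of 145,379 is consistent with "over 100,000 utterances".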
The list of duplicates (34 videos only in the train set) can be found [here](./data/v1/duplicates.txt).
The train/val/test split used in [1] below for Speaker Identification can be found [here](./data/v1/Identification_split.txt).
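A minimal sketch of reading the identification split file follows. It assumes each line holds an integer split label followed by a relative utterance path, with 1, 2, and 3 meaning train, validation, and test. That layout and label mapping are assumptions and should be checked against the file itself. `parse_split` and `SPLIT_NAMES` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed mapping from the integer label to a split name.
SPLIT_NAMES = {1: "train", 2: "val", 3: "test"}


def parse_split(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group utterance paths by split name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first run of whitespace so the path stays whole.
        label, path = line.split(maxsplit=1)
        groups[SPLIT_NAMES[int(label)]].append(path)
    return dict(groups)
```

Passing an open file handle for `./data/v1/Identification_split.txt` to `parse_split` would then give one list of utterance paths per split, provided the file matches the assumed layout.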
Models:
- Pretrained models from dataset authors for VGGVox - Speaker Identification and Verification [1] can be found [here](https://github.com/a-nagrani/VGGVox).

Notice:
> We are preparing an extended dataset (VoxCeleb2), containing up to 4 times as many speakers and videos.
VoxCeleb2 was originally due to be released in Q4 2017; however, it has been delayed to Q1 2018 due to resource constraints.

-------
Publications:
[1] A. Nagrani, J. S. Chung, A. Zisserman - [VoxCeleb: a large-scale speaker identification dataset](./docs/2017-Nagrani-VoxCeleb_large-scale_speaker_identification_dataset.pdf) - INTERSPEECH, 2017
[2] Yifan He, Zhang Zhang - [Speaker Identication with VoxCeleb DataSet](./docs/2017-YifajHeZhangZhang-Speaker_Identication_with_VoxCeleb_DataSet-stanford_students_raport.pdf) - Stanford students project, 2017