{"id":17988716,"url":"https://github.com/joonson/voxconverse","last_synced_at":"2025-04-04T03:25:49.666Z","repository":{"id":47509650,"uuid":"279545997","full_name":"joonson/voxconverse","owner":"joonson","description":"Spot the conversation: speaker diarisation in the wild","archived":false,"fork":false,"pushed_at":"2022-07-26T18:48:44.000Z","size":312,"stargazers_count":133,"open_issues_count":1,"forks_count":15,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-02-09T15:13:41.205Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/joonson.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-07-14T09:48:57.000Z","updated_at":"2025-02-05T14:31:27.000Z","dependencies_parsed_at":"2022-08-25T16:21:54.200Z","dependency_job_id":null,"html_url":"https://github.com/joonson/voxconverse","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joonson%2Fvoxconverse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joonson%2Fvoxconverse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joonson%2Fvoxconverse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/joonson%2Fvoxconverse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/joonson","download_url":"https://codeload.github.com/joonson/voxconverse/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2
47115070,"owners_count":20886072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-29T19:12:34.756Z","updated_at":"2025-04-04T03:25:49.646Z","avatar_url":"https://github.com/joonson.png","language":null,"funding_links":[],"categories":["Datasets"],"sub_categories":["Diarization datasets"],"readme":"## VoxConverse speaker diarisation dataset\n\nVoxConverse is an audio-visual diarisation dataset consisting of multispeaker clips of human speech, extracted from YouTube videos.\nUpdates and additional information about the dataset can be found at our [website](http://www.robots.ox.ac.uk/~vgg/data/voxconverse/index.html).\n\n\n### Version 0.3\nWe have recently detected an error in some of our test RTTM files. They are fixed in this master branch. Please use the 0.3 version for more accurate labels.\n\n### Version 0.2\nIf you want to see the previous version, please go to the ver0.2 branch in this repository.\n\n#### Audio files\n\nDev set audio files can be downloaded from [here](https://www.robots.ox.ac.uk/~vgg/data/voxconverse/data/voxconverse_dev_wav.zip). \nTest set audio files can be downloaded from [here](https://www.robots.ox.ac.uk/~vgg/data/voxconverse/data/voxconverse_test_wav.zip).\n\n#### Speaker Diarisation annotations \n\nAnnotations are provided as Rich Transcription Time Marked (RTTM) files and can be found in the ```dev``` and ```test``` folders. 
\n\n#### Citation\n\nPlease cite the following if you make use of the dataset.\n\n```\n@article{chung2020spot,\n  title={Spot the conversation: speaker diarisation in the wild},\n  author={Chung, Joon Son and Huh, Jaesung and Nagrani, Arsha and Afouras, Triantafyllos and Zisserman, Andrew},\n  booktitle={Interspeech},\n  year={2020}\n}\n```\n\n#### License\n\nThe VoxConverse dataset is available to download for research purposes under a [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0). The copyright remains with the original owners of the video. \n\nIn order to obtain videos with a large amount of overlapping speech, we used data consisting of political debates and news segments. The views and opinions expressed by speakers in the dataset are those of the individual speakers and do not necessarily reflect positions of the University of Oxford, Naver Corporation, or the authors of the paper.\n\nWe would also like to note that the distribution of identities in this dataset may not be representative of the global human population. Please be careful of unintended societal, gender, racial, linguistic and other biases when training or deploying models trained on this data.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoonson%2Fvoxconverse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjoonson%2Fvoxconverse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjoonson%2Fvoxconverse/lists"}