{"id":49204546,"url":"https://github.com/qualcomm/voiceai-dataset","last_synced_at":"2026-04-23T17:04:39.074Z","repository":{"id":343507832,"uuid":"1084224439","full_name":"qualcomm/voiceai-dataset","owner":"qualcomm","description":"This project is used to release voice samples dataset we use in voiceai models","archived":false,"fork":false,"pushed_at":"2026-04-02T01:51:21.000Z","size":11,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-02T14:39:06.912Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qualcomm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE-OF-CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-27T11:52:15.000Z","updated_at":"2026-04-02T01:51:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/qualcomm/voiceai-dataset","commit_stats":null,"previous_names":["qualcomm/voiceai-dataset"],"tags_count":2,"template":false,"template_full_name":"qualcomm/qualcomm-repository-template","purl":"pkg:github/qualcomm/voiceai-dataset","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualcomm%2Fvoiceai-dataset","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualcomm%2Fvoiceai-dataset/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualcomm%2Fvo
iceai-dataset/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualcomm%2Fvoiceai-dataset/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qualcomm","download_url":"https://codeload.github.com/qualcomm/voiceai-dataset/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualcomm%2Fvoiceai-dataset/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32189670,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-23T15:28:30.493Z","status":"ssl_error","status_checked_at":"2026-04-23T15:28:29.972Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-23T17:04:39.014Z","updated_at":"2026-04-23T17:04:39.062Z","avatar_url":"https://github.com/qualcomm.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# VoiceAI dataset\n\nThis project provides a collection of datasets that are used in the VoiceAI notebooks from QPM (Qualcomm Package Manager) releases. 
You can download the datasets from this repository so that you do not need to download the full original datasets.\n\n## Dataset list\n\n| Dataset           |  Description    | Download      |\n| :---------------- | :---------------|:-------------:|\n| Common Voice for Whisper notebook   | a small portion of [Common Voice](https://commonvoice.mozilla.org/en/datasets) V9 English dataset  | [Link](https://github.com/qualcomm/voiceai-dataset/releases/download/whisper_dataset/common_voice_9.0_for_whisper_notebook.zip) |\n| LibriSpeech for Whisper notebook    | a small portion of `train-clean-100` and `train-other-500` \u003cbr\u003e datasets from [LibriSpeech](https://www.openslr.org/12) | [Link](https://github.com/qualcomm/voiceai-dataset/releases/download/whisper_dataset/LibriSpeech_for_whisper_notebook.zip) |\n| Common Voice for Zipformer notebook | a small portion of [Common Voice](https://commonvoice.mozilla.org/en/datasets) V9 English and Chinese datasets | [Link](https://github.com/qualcomm/voiceai-dataset/releases/download/zipformer_dataset/common_voice_9.0_for_zipformer_notebook.zip) |\n\n## Usage\n\nDownload the datasets from the releases page, then follow the instructions in the notebook for the VoiceAI model you are using.\n\n## Getting in Contact\n\n* [Report an Issue on GitHub](../../issues)\n\n## License\n\nThe project is licensed under the [BSD-3-clause License](https://spdx.org/licenses/BSD-3-Clause.html). See [LICENSE.txt](LICENSE.txt) for the full license text.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqualcomm%2Fvoiceai-dataset","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqualcomm%2Fvoiceai-dataset","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqualcomm%2Fvoiceai-dataset/lists"}