{"id":15036910,"url":"https://github.com/apple/ml-spatial-librispeech","last_synced_at":"2025-10-19T22:32:34.475Z","repository":{"id":189247322,"uuid":"680306865","full_name":"apple/ml-spatial-librispeech","owner":"apple","description":"A large synthetic dataset of spatial audio with multiple labels","archived":false,"fork":false,"pushed_at":"2023-10-25T15:49:18.000Z","size":12,"stargazers_count":96,"open_issues_count":0,"forks_count":8,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-01-30T07:33:11.246Z","etag":null,"topics":["machine-learning","spatial-audio"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apple.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-08-18T21:29:08.000Z","updated_at":"2025-01-21T01:46:24.000Z","dependencies_parsed_at":"2023-08-19T00:28:32.321Z","dependency_job_id":"7b835386-54c5-4db4-a9f8-dc189016f13c","html_url":"https://github.com/apple/ml-spatial-librispeech","commit_stats":null,"previous_names":["apple/ml-spatial-librispeech"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apple%2Fml-spatial-librispeech","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apple%2Fml-spatial-librispeech/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apple%2Fml-spatial-librispeech/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apple%2Fml-spatial-librispeech/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/apple","download_url":"https://codeload.github.com/apple/ml-spatial-librispeech/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237224861,"owners_count":19275098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","spatial-audio"],"created_at":"2024-09-24T20:32:44.275Z","updated_at":"2025-10-19T22:32:29.179Z","avatar_url":"https://github.com/apple.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spatial LibriSpeech\n\nSpatial LibriSpeech is a spatial audio dataset with over 650 hours of first-order\nambisonics and optional distractor noise (with raw 19-channel audio coming soon). Spatial LibriSpeech is designed for machine learning\nmodel training, and it includes labels for source position, speaking direction, room acoustics and\ngeometry. 
Spatial LibriSpeech was generated by augmenting LibriSpeech samples with 200k+ simulated\nacoustic conditions across 8k+ synthetic rooms.\n\nFor more information, refer to our paper: https://doi.org/10.21437/Interspeech.2023-2117.\n\nIf you use Spatial LibriSpeech in a publication, please cite our paper:\n```\n@inproceedings{spatial_librispeech2023,\n  author={Miguel Sarabia and Elena Menyaylenko and Alessandro Toso and Skyler Seto\n          and Zakaria Aldeneh and Shadi Pirhosseinloo and Luca Zappella\n          and Barry-John Theobald and Nicholas Apostoloff and Jonathan Sheaffer},\n  title={{Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning}},\n  year={2023},\n  booktitle={Proc. Interspeech},\n  pages={3724--3728},\n  doi={10.21437/Interspeech.2023-2117}\n}\n```\n\n## 📜 License\n\nBy downloading and using Spatial LibriSpeech, you are agreeing to comply with\nthe terms of its [LICENSE](LICENSE).\n\n## 💾 Download\nOur downloader script and PyTorch dataloader will be uploaded soon.\n\n### Manual download\nIn the meantime, all our files are hosted here:\n```python3\nSLS_URI = \"https://docs-assets.developer.apple.com/ml-research/datasets/spatial-librispeech/v1\"\n```\nYou can manually download the metadata from the URL shown below. 
Refer to [dataset schema](DATASET_SCHEMA.md)\nfor more information about how the data is structured.\n```python3\nf\"{SLS_URI}/metadata.parquet\"\n```\nUsing the metadata you can manually download samples with:\n```python3\n# speech first order ambisonics samples\nf\"{SLS_URI}/ambisonics/{sample_id:06}.flac\"\n# distractor noise first order ambisonics samples\nf\"{SLS_URI}/noise_ambisonics/{sample_id:06}.flac\"\n```\n\nSo, for instance, you may download the metadata with this command:\n```bash\ncurl -O https://docs-assets.developer.apple.com/ml-research/datasets/spatial-librispeech/v1/metadata.parquet\n```\nAnd the first speech sample with:\n```bash\ncurl -O https://docs-assets.developer.apple.com/ml-research/datasets/spatial-librispeech/v1/ambisonics/000000.flac\n```\n\n⚠️ 19-channel speech and distractor noise samples are very large and we are evaluating how to best host them. If\nyou need them in the meantime, please contact us.\n\n## ✉️ Contact\n\n*  [spatial-librispeech-dataset@group.apple.com](mailto:spatial-librispeech-dataset@group.apple.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapple%2Fml-spatial-librispeech","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapple%2Fml-spatial-librispeech","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapple%2Fml-spatial-librispeech/lists"}