# Open-source Audio Datasets

![banner](https://user-images.githubusercontent.com/66431403/193427111-af11f270-bce0-4ad8-b0f9-02526312a9c2.png)

## What is DagsHub?

[DagsHub](https://dagshub.com/) is a centralized platform to host and manage machine learning projects including code, data, models, experiments, annotations, model registry, and more! DagsHub does the MLOps heavy lifting for its users. Every repository comes with configured S3 storage, an experiment tracking server, and an annotation workspace - all using popular open-source tools like MLflow, DVC, Git, and Label Studio.

## What is Hacktoberfest?

[Hacktoberfest](https://hacktoberfest.com/) is a month-long virtual festival of open source! Participants give back to the community by completing pull requests, participating in events, and donating to open-source projects. This project is part of Hacktoberfest 2023, where participants enrich the open-source audio datasets hosted on DagsHub.

## Quick Start to Contribution
- Sign up for [Hacktoberfest](https://hacktoberfest.com/auth/) & [DagsHub](https://dagshub.com/user/sign_up?redirect_to=).
- Join our [Hacktoberfest 2022 Discord channel](https://discord.gg/6SsqDCUVeq).
- Read the [contribution guidelines](https://hacktoberfest.com/participation/).
- Create a pull request on the GitHub [audio-datasets](https://github.com/DAGsHub/audio-datasets) repository.

## What does the DagsHub community contribute?
**This year we'd like to focus our contribution on the audio domain**. To that end, we added audio data catalog capabilities to DagsHub! You can now upload audio files to DagsHub, view their spectrograms and waveforms, and even listen to them! You can see a vivid example of this (extremely cool) feature in our [Librispeech-ASR-corpus](https://dagshub.com/DagsHub/Librispeech-ASR-corpus/src/master/dev-clean/84/121123/84-121123-0000.flac) project.

![audio-catalog](assets/audio-catalog.png)

To help audio practitioners leverage this new feature, we want to enrich open-source audio datasets on DagsHub. This is where you can contribute to the data science community!

## How to contribute?
- Claim the dataset you wish to contribute from the [list](https://github.com/jim-schwoebel/voice_datasets/blob/master/README.md) (KUDOS to [jim-schwoebel](https://github.com/jim-schwoebel)) by opening a new issue on the [GitHub repository](https://github.com/DAGsHub/audio-datasets) and naming it after the dataset. Please make sure that the dataset hasn't already been claimed.
- Open a new DagsHub repository and upload the data to its DVC storage (e.g., [dataset repository](https://dagshub.com/DagsHub/Librispeech-ASR-corpus)).
- Write information about the dataset in the README file (e.g., [Librispeech ASR corpus README](https://dagshub.com/DagsHub/Librispeech-ASR-corpus/src/master/README.md)).
- Add relevant tags to the repository and files.
- Add the following labels to the repository:
  - `dataset`
  - `audio`
  - `hacktoberfest`
- In the GitHub [audio-datasets](https://github.com/DAGsHub/audio-datasets) project:
  - Open a new branch named after the dataset.
  - Add a directory named after the dataset with the README file.
  - Commit and push the changes to GitHub.
  - Create a pull request on GitHub.
- Optional: Share the project on the DagsHub [Hacktoberfest 2022 Discord channel](https://discord.gg/6SsqDCUVeq).
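
The GitHub side of the steps above (branch, directory, README, commit, push) can be sketched in Python. This is a minimal illustration, not an official DagsHub tool: the helper names `dataset_slug` and `prepare_contribution`, the lowercase hyphenated naming convention, the `README.md` filename, and the commit message are all assumptions made for the example. The script only writes the README into a directory named after the dataset and returns the git commands as strings, so you can review them before running anything.

```python
import re
import tempfile
from pathlib import Path
from typing import List


def dataset_slug(name: str) -> str:
    """Turn a dataset name into a lowercase, hyphen-separated slug.

    The slug format is an assumption for this sketch; the project
    only asks that the branch and directory be named after the dataset.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("dataset name must contain letters or digits")
    return slug


def prepare_contribution(repo_root: Path, dataset_name: str, readme_text: str) -> List[str]:
    """Create <repo_root>/<slug>/README.md and return the git commands to run.

    Nothing is executed here; the commands are returned as plain strings.
    """
    slug = dataset_slug(dataset_name)
    dataset_dir = repo_root / slug
    dataset_dir.mkdir(parents=True, exist_ok=True)
    (dataset_dir / "README.md").write_text(readme_text, encoding="utf-8")
    return [
        f"git checkout -b {slug}",                    # new branch named after the dataset
        f"git add {slug}/README.md",                  # stage the dataset README
        f'git commit -m "Add {dataset_name} dataset"',  # hypothetical commit message
        f"git push -u origin {slug}",                 # publish the branch before opening a PR
    ]


if __name__ == "__main__":
    # Demo in a temporary directory; point repo_root at your local clone instead.
    with tempfile.TemporaryDirectory() as tmp:
        commands = prepare_contribution(
            Path(tmp), "LibriSpeech ASR corpus", "# LibriSpeech ASR corpus\n"
        )
        for command in commands:
            print(command)
```

After pushing the branch, open the pull request from the GitHub web interface as described in the list above.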