https://github.com/ktonal/audioset-downloader
cli to download examples of a specific class from google's AudioSet
audio audioset cli dataset python3
- Host: GitHub
- URL: https://github.com/ktonal/audioset-downloader
- Owner: ktonal
- License: mit
- Created: 2022-04-21T16:17:16.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-12-20T20:41:28.000Z (almost 2 years ago)
- Last Synced: 2025-03-26T14:38:49.952Z (7 months ago)
- Topics: audio, audioset, cli, dataset, python3
- Language: Python
- Homepage:
- Size: 68.5 MB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# audioset-downloader
CLI for easily building datasets of audio files from Google's AudioSet.

**Update May 2023**: metadata has been extended with the number of views, likes, and comments, plus availability.
## Installation
```bash
pip install audioset-downloader
```

Note that you'll need to have `ffmpeg` installed on your system.
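
As a quick sanity check before downloading, the sketch below tests whether an `ffmpeg` executable is discoverable on `PATH`. The helper name `ffmpeg_available` is hypothetical and not part of this package; it only illustrates the requirement stated above.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if `executable` can be found on PATH (hypothetical helper)."""
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("ffmpeg not found on PATH; install it before running audioset-dl")
```
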
## Features
- filter by class names (union or intersection)
- filter by set (train [balanced, unbalanced], eval)
- limit number of downloads
- select most viewed / most liked

## Usage
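
If you want to drive the tool from a Python script, one option is to assemble the argument list yourself and hand it to `subprocess`. The sketch below is only an illustration: `build_audioset_dl_command` is a hypothetical helper, not part of this package, and it uses only flags that appear in the help text below (`-o`, `-c`, `-u`, `-n`, `-mv`).

```python
import shlex


def build_audioset_dl_command(class_names, output_dir="./", n_examples=None,
                              most_viewed=False, union=False):
    """Assemble an argv list for `audioset-dl` (hypothetical helper)."""
    cmd = ["audioset-dl", "-o", output_dir]
    # `-c` can be repeated, once per class name
    for name in class_names:
        cmd += ["-c", name]
    if union:
        cmd.append("-u")
    if n_examples is not None:
        cmd += ["-n", str(n_examples)]
        # per the help text, `-mv` applies when `--n-examples` is provided
        if most_viewed:
            cmd.append("-mv")
    return cmd


cmd = build_audioset_dl_command(["Music", "Techno"], output_dir="./techno",
                                n_examples=50, most_viewed=True)
print(shlex.join(cmd))
# prints: audioset-dl -o ./techno -c Music -c Techno -n 50 -mv
```

To actually start the download, pass the list to `subprocess.run(cmd, check=True)`; this assumes `audioset-dl` is installed and on your `PATH`.
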
```
Usage: audioset-dl [OPTIONS]

  download examples of a specific class from google's AudioSet

Options:
  -o, --output-dir TEXT           target directory for the downloads
                                  (default='./')
  -c, --class-name TEXT           the name of the class to download
                                  (default=Snoring)
                                  this option can be repeated to select
                                  examples at the intersection of multiple
                                  classes
                                  e.g. `-c Music -c Techno`
                                  (list of available classes can be printed
                                  out with the command `audioset-classes`)
  -u, --class-union               toggle whether class names should intersect
                                  (default) or not
  -m, --mixed                     if provided, the downloaded examples will be
                                  instances of `--class-name` and possibly some
                                  other classes. Otherwise (default behaviour),
                                  downloaded examples have only `--class-name`
                                  as single label.
  -xe, --exclude-eval-set         if provided, exclude examples from the eval
                                  set (default=False)
  -xb, --exclude-balanced-set     if provided, exclude examples from the
                                  balanced set (default=False)
  -xu, --exclude-unbalanced-set   if provided, exclude examples from the
                                  unbalanced set (default=False)
  -n, --n-examples INTEGER        number of examples to download (default=all
                                  matching)
  -f, --full-source               if provided, download full examples instead
                                  of 10 sec. segments (default=False)
  -mv, --most-viewed              if --n-examples is provided, only the n most
                                  viewed examples will be downloaded
  -ml, --most-liked               if --n-examples is provided, only the n most
                                  liked examples will be downloaded
  --help                          Show this message and exit.
```

You can also print the available class names with:
```bash
audioset-classes
```

## References
- AudioSet Homepage:
https://research.google.com/audioset/index.html
- Dataset classes content:
https://research.google.com/audioset/dataset/index.html
## License

MIT