https://github.com/materialvision/augment_audio_tools

Simple python scripts to augment a dataset of audio

Host: GitHub
URL: https://github.com/materialvision/augment_audio_tools
Owner: materialvision
License: cc0-1.0
Created: 2023-03-26T09:02:17.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-06-07T10:07:00.000Z (almost 2 years ago)
Last Synced: 2024-08-08T00:43:10.261Z (8 months ago)
Language: Python
Size: 17.6 KB
Stars: 7
Watchers: 1
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

project-awesome - materialvision/augment_audio_tools - Simple python scripts to augment a dataset of audio (Python)

README

# Audio Augmentation Tools for Machine Learning

This script provides a set of audio augmentations for machine learning and is particularly useful for the RAVE model (https://github.com/acids-ircam/RAVE). It processes audio files by changing their speed, resampling, splitting stereo files into mono, adding silence, and creating chunks.

## Features

- Change speed of the audio file
- Resample the audio file
- Split stereo files into mono
- Add silence to the audio file
- Create chunks of the audio file
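
Three of the features above (stereo splitting, silence padding, and chunking) can be illustrated on plain NumPy arrays. This is only a minimal sketch, not the script's actual code: the function names `split_stereo`, `add_silence`, and `make_chunks` are hypothetical, and the array layout `(frames, channels)` is assumed because that is what `soundfile.read` returns for multichannel files.

```python
import numpy as np


def split_stereo(audio):
    """Split a (frames, 2) stereo array into two mono arrays (hypothetical helper)."""
    if audio.ndim != 2 or audio.shape[1] != 2:
        raise ValueError("expected a (frames, 2) stereo array")
    return audio[:, 0].copy(), audio[:, 1].copy()


def add_silence(audio, sample_rate, seconds):
    """Append `seconds` of zeros to the end of the audio (hypothetical helper)."""
    n_frames = int(round(seconds * sample_rate))
    # Keep any channel dimension so this works for mono and multichannel input.
    pad = np.zeros((n_frames,) + audio.shape[1:], dtype=audio.dtype)
    return np.concatenate([audio, pad])


def make_chunks(audio, sample_rate, chunk_duration):
    """Cut audio into consecutive chunks; the last chunk may be shorter."""
    size = int(round(chunk_duration * sample_rate))
    if size <= 0:
        raise ValueError("chunk_duration must be positive")
    return [audio[start:start + size] for start in range(0, len(audio), size)]


# Tiny synthetic example: 2 seconds of stereo audio at a toy rate of 8 Hz.
sample_rate = 8
stereo = np.arange(32, dtype=np.float32).reshape(16, 2)

left, right = split_stereo(stereo)
padded = add_silence(left, sample_rate, 0.5)   # 16 frames + 4 frames of silence
chunks = make_chunks(padded, sample_rate, 1.0)  # chunks of 8, 8, and 4 frames
```

Whether the real script keeps or discards a short final chunk is not stated in this README; the sketch keeps it.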

## Requirements

- Python 3.6 or higher
- NumPy
- SoundFile
- Resampy

To install the required packages, you can run:

```bash
pip install numpy soundfile resampy
```

## Usage

To use this script, run it from the command line with the following arguments:

```
python augment_audio_speed.py input_folder output_folder [--chunk_duration SECONDS] [--split_stereo] [--add_silence SECONDS] [--speed_change FACTOR]
```

- `input_folder`: Path to the input folder containing audio files
- `output_folder`: Path to the output folder for processed files
- `--chunk_duration`: (optional) Duration of each chunk in seconds (default: 30 seconds)
- `--split_stereo`: (optional) Split stereo files into two mono files
- `--add_silence`: (optional) Length of silence in seconds added to the end of each sound file
- `--speed_change`: (optional) Speed change factor 0.0-0.9 (default: 0.0, no change)
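
For readers who want to wrap or reimplement the command line, the arguments listed above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's source: `build_parser` is a hypothetical name, the value types are guessed from the descriptions, and the `0.0` default for `--add_silence` is an assumption because the README does not state one.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Augment a folder of audio files"
    )
    parser.add_argument("input_folder", help="folder containing audio files")
    parser.add_argument("output_folder", help="folder for processed files")
    parser.add_argument("--chunk_duration", type=float, default=30.0,
                        help="duration of each chunk in seconds")
    parser.add_argument("--split_stereo", action="store_true",
                        help="split stereo files into two mono files")
    # Assumed default: the README does not document one for --add_silence.
    parser.add_argument("--add_silence", type=float, default=0.0,
                        help="seconds of silence appended to each file")
    parser.add_argument("--speed_change", type=float, default=0.0,
                        help="speed change factor, 0.0-0.9 (0.0 = no change)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is
# reproducible when run on its own.
args = build_parser().parse_args([
    "input_folder", "output_folder",
    "--chunk_duration", "30",
    "--split_stereo",
    "--add_silence", "1.5",
    "--speed_change", "0.1",
])
```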

Example:

```bash
python augment_audio_speed.py input_folder output_folder --chunk_duration 30 --split_stereo --add_silence 1.5 --speed_change 0.1
```

This processes all supported audio files in `input_folder` and saves the results to `output_folder` with the specified augmentations applied.

## Supported Audio Formats

The script supports the following audio file formats:

- .wav
- .flac
- .ogg
- .aiff
- .mp3
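
To check in advance which files in a folder fall under the formats listed above, a small filter like the one below can help. It is a sketch only: `find_audio_files` and `SUPPORTED_EXTENSIONS` are hypothetical names, and matching extensions case-insensitively is an assumption, since the README does not say how the script compares them.

```python
from pathlib import Path

# Extensions taken from the list above.
SUPPORTED_EXTENSIONS = {".wav", ".flac", ".ogg", ".aiff", ".mp3"}


def find_audio_files(folder):
    """Return the files in `folder` whose extension is in the supported set."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        # Assumption: extensions are compared case-insensitively.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_audio_files("input_folder")` returns a sorted list of `Path` objects and does not descend into subfolders.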
