https://github.com/voidful/soundon


Host: GitHub
URL: https://github.com/voidful/soundon
Owner: voidful
Created: 2024-07-24T13:19:31.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-09-08T06:20:22.000Z (almost 2 years ago)
Last Synced: 2025-03-19T06:04:36.784Z (over 1 year ago)
Language: Python
Size: 130 KB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# SoundOn: Any Codec to Mel Spectrogram

SoundOn is a project that converts audio files to mel spectrograms, a format suitable for training sound generation models. This repository contains scripts to prepare the dataset and convert audio files to mel spectrograms.

## Getting Started

### Dependencies

- Python 3.8 or later

- PyTorch

- torchaudio

- librosa (optional, for additional audio processing)

Ensure you have Python installed on your system. This project is developed and tested on macOS, but it should be compatible with Linux and Windows, provided all dependencies are met.

### Installing

Clone the repository to your local machine:

```bash
git clone https://github.com/voidful/soundon.git
cd soundon
```

### Preparing the Dataset

Place your `.wav` audio files in a directory. Update the `directory` parameter in the `MelDataset` instantiation in `dataset_prepare.py` to point to this directory.
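Before pointing `MelDataset` at a directory, it can help to confirm that the directory actually contains `.wav` files. The following is a minimal sketch using only the standard library. `list_wav_files` is a hypothetical helper, not part of this repository, and it assumes the files sit directly in the directory; whether `MelDataset` also searches subdirectories is not stated here.

```python
from pathlib import Path


def list_wav_files(directory):
    """Return sorted paths of the .wav files directly inside `directory`.

    Hypothetical helper for illustration; not part of this repository.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory!r} is not a directory")
    # Match the extension case-insensitively so "CLIP.WAV" is found too.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```

Calling `len(list_wav_files('/path/to/audio/files'))` gives a quick count of the files that would be available for conversion.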

### Running the Dataset Preparation

Execute `dataset_prepare.py` to start the dataset preparation process:

```bash
python dataset_prepare.py
```

This script will process all `.wav` files in the specified directory, converting them into mel spectrograms and printing their shapes.
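The printed shapes depend on the clip length and the framing parameters. As a rough sketch of the arithmetic, the function below estimates the shape using the `hop_size` and `num_mels` values from the Usage example. It covers two common STFT conventions: a centered STFT, which yields `1 + n // hop` frames, and a manually padded, non-centered one, which yields `n // hop` frames. Which convention this repository follows is not stated here, so both are assumptions, and `expected_mel_shape` is a hypothetical name.

```python
def expected_mel_shape(num_samples, hop_size=256, num_mels=100, centered=True):
    """Estimate the (num_mels, frames) shape of a mel spectrogram.

    Assumption: a centered STFT yields 1 + num_samples // hop_size frames,
    while the padded, non-centered variant yields num_samples // hop_size.
    """
    frames = num_samples // hop_size
    if centered:
        frames += 1
    return (num_mels, frames)


# One second of 24 kHz audio: 24000 // 256 = 93 full hops.
print(expected_mel_shape(24000))                  # (100, 94)
print(expected_mel_shape(24000, centered=False))  # (100, 93)
```

If the script prints a leading batch or channel dimension, the last two dimensions are the ones to compare against this estimate.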

## Usage

The `MelDataset` class can be used as follows:

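```python
from dataset_prepare import MelDataset
from torch.utils.data import DataLoader

# Mel spectrogram extraction settings
params = {
    'sampling_rate': 24000,  # audio sample rate in Hz
    'n_fft': 1024,           # FFT size
    'win_size': 1024,        # analysis window length in samples
    'hop_size': 256,         # samples between successive frames
    'num_mels': 100,         # number of mel bands
    'fmin': 0,               # lower frequency bound in Hz
    'fmax': None,            # upper frequency bound in Hz (unset here)
}

dataset = MelDataset(directory='/path/to/audio/files', params=params)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=4)

# Each batch is a tensor of mel spectrograms
for mel_spec in dataloader:
    print(mel_spec.shape)
```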
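The `params` values translate directly into time and frequency resolution. The sketch below works through that calculation. It assumes that `fmax=None` means the Nyquist frequency (half the sampling rate), which is the convention in librosa and torchaudio but is not confirmed for this repository. `describe_params` is a hypothetical helper.

```python
def describe_params(params):
    """Derive resolution figures from a MelDataset-style params dict."""
    sr = params['sampling_rate']
    # Assumption: fmax=None falls back to the Nyquist frequency.
    fmax = params['fmax'] if params['fmax'] is not None else sr / 2
    return {
        'frames_per_second': sr / params['hop_size'],
        'hop_ms': 1000 * params['hop_size'] / sr,
        'window_ms': 1000 * params['win_size'] / sr,
        'stft_bins': params['n_fft'] // 2 + 1,
        'fmax_hz': fmax,
    }


params = {
    'sampling_rate': 24000,
    'n_fft': 1024,
    'win_size': 1024,
    'hop_size': 256,
    'num_mels': 100,
    'fmin': 0,
    'fmax': None,
}

for name, value in describe_params(params).items():
    print(f"{name}: {value}")
```

With these values the result is 93.75 frames per second, a hop of about 10.7 ms, a window of about 42.7 ms, 513 STFT bins, and an upper band edge of 12 kHz.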

## Contributing

Contributions to the project are welcome! Please fork the repository, create a feature branch, and submit a pull request.

## License

This project is licensed under the MIT License - see the `LICENSE` file for details.

## Acknowledgments

- The `code2mel.py` and `dataset_prepare.py` scripts are foundational to this project, enabling the conversion of audio files to a format suitable for sound generation models.
