https://github.com/ivanlmh/murgan
Simple singing style transfer project using GANs
- Host: GitHub
- URL: https://github.com/ivanlmh/murgan
- Owner: ivanlmh
- License: apache-2.0
- Created: 2023-10-30T11:41:39.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-11-01T10:26:45.000Z (almost 2 years ago)
- Last Synced: 2024-01-29T06:17:28.019Z (over 1 year ago)
- Topics: generative-adversarial-network, murga-singing, singing-voice-conversion, style-transfer
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# murGAN: Singing Voice to Murga Singing Style Transfer
This repository contains a simple approach to transferring the singing style of classical or pop singers to the distinctive style of Uruguayan Murga (though, of course, it can be trained to transfer to any style).
*Murga is a modern musical expression native to Uruguay. Its singing has distinctive characteristics, especially when compared to pop singing or classical European choral singing: there are significant differences in spectral centroid and spectral flatness, and murga singing often deviates from the fundamental frequency and uses little vibrato, among other particularities of intonation and vocal expression.*
I use a Generative Adversarial Network, borrowing from the StarGAN architecture the idea that the discriminator not only learns to distinguish real from fake audio, but also whether the generator's output belongs to the desired domain.
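To make that concrete, a StarGAN-style discriminator exposes two heads: one that scores real vs. fake, and one that classifies which domain the input belongs to. A minimal PyTorch sketch of that layout (illustrative only, assuming mel-spectrogram inputs; it is not the network defined in `src/`):
```python
# Sketch of a StarGAN-style discriminator with an auxiliary domain classifier.
# Assumes mel-spectrogram inputs of shape (batch, 1, n_mels, frames); illustrative only.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, num_domains=2):
        super().__init__()
        # Shared convolutional trunk
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
        )
        # Head 1: real vs. fake (patch-wise logits)
        self.adv_head = nn.Conv2d(128, 1, kernel_size=3, padding=1)
        # Head 2: which domain (murga / non-murga) the input belongs to
        self.cls_head = nn.Conv2d(128, num_domains, kernel_size=3, padding=1)

    def forward(self, x):
        h = self.trunk(x)
        real_fake = self.adv_head(h)                       # (batch, 1, H, W)
        domain_logits = self.cls_head(h).mean(dim=(2, 3))  # (batch, num_domains)
        return real_fake, domain_logits
```
During training, the domain head is fit on real audio with known labels, and the generator is pushed to produce audio that this head assigns to the target domain, as in StarGAN.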
## Getting Started
1. Clone the repository:
```bash
git clone git@github.com:ivanlmh/murGAN.git
cd murGAN
```
2. Set up the environment:
```bash
conda env create -f environment.yml
conda activate murGAN
```
3. Install the project for development:
```bash
pip install -e .
```

## Dataset Preparation
The datasets I use consist of two folders: one with "Murga"-style singing and another with non-murga (classical, pop, etc.) singing. Each track in the dataset should be at least 10 seconds long to ensure consistent input size for the model. For the non-murga side I use the vocals from MUSDB18 and the stems from the Choral Singing Dataset (see references).
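As a quick sanity check of that 10-second requirement, something like the following can flag clips that are too short (a sketch, assuming `soundfile` is installed and the hypothetical folders `data/murga` and `data/non_murga`):
```python
# Sketch: flag clips shorter than 10 s. Assumes soundfile and the hypothetical
# folders data/murga and data/non_murga; adjust the paths to your setup.
from pathlib import Path
import soundfile as sf

MIN_SECONDS = 10.0

for folder in [Path("data/murga"), Path("data/non_murga")]:
    for wav in sorted(folder.glob("*.wav")):
        info = sf.info(wav)
        duration = info.frames / info.samplerate
        if duration < MIN_SECONDS:
            print(f"Too short ({duration:.1f} s): {wav}")
```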
In the `scripts` folder you will find a script that creates symlinks to audio files, so you can pull files from other projects into a local `data` folder and keep things tidy without using extra disk space.
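The idea behind that script is roughly the following (a sketch of the approach with placeholder paths, not the exact script in `scripts/`):
```python
# Sketch of the symlink idea: point files from an external dataset into a local
# data/ folder without copying them. Source and target paths below are placeholders.
from pathlib import Path

def link_audios(source_dir: str, target_dir: str, pattern: str = "*.wav") -> None:
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(source_dir).glob(pattern)):
        dst = target / src.name
        if not dst.exists():
            dst.symlink_to(src.resolve())  # symlink instead of copying

# Example: expose MUSDB18 vocal stems under data/non_murga (hypothetical paths)
link_audios("/datasets/musdb18/vocals", "data/non_murga")
```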
## Model Architecture
The model leverages the StarGAN paradigm. For more details on the architecture, refer to the original [StarGAN paper](https://arxiv.org/abs/1711.09020).
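One StarGAN ingredient worth spelling out is how the generator is told which domain to produce: a one-hot target-domain label is broadcast and concatenated to the input features. A minimal sketch of that conditioning step (illustrative, not the layer definitions in `src/`):
```python
# Sketch of StarGAN-style domain conditioning: a one-hot target-domain label is
# broadcast over the time/frequency axes and concatenated to the input spectrogram.
import torch

def condition_on_domain(spec: torch.Tensor, domain: torch.Tensor) -> torch.Tensor:
    # spec:   (batch, 1, n_mels, frames)
    # domain: (batch, num_domains) one-hot target labels
    b, _, n_mels, frames = spec.shape
    label_map = domain[:, :, None, None].expand(b, domain.size(1), n_mels, frames)
    return torch.cat([spec, label_map], dim=1)  # (batch, 1 + num_domains, n_mels, frames)

spec = torch.randn(4, 1, 80, 256)
target = torch.eye(2)[torch.tensor([1, 1, 0, 1])]  # e.g. index 1 = murga
x = condition_on_domain(spec, target)
print(x.shape)  # torch.Size([4, 3, 80, 256])
```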
## Training
To train the model, run:
```bash
python src/train.py
```
Ensure that both the murga and non-murga datasets are prepared and placed in their respective directories.
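Conceptually, each generator update combines an adversarial term, a domain-classification term, and a cycle-consistency term, as in StarGAN. A rough sketch of that objective (the names `G`, `D` and the loss weights are placeholders, not taken from `src/train.py`):
```python
# Rough sketch of one StarGAN-style generator update. G, D, real, src_label,
# trg_label and the lambda weights are placeholders, not names from src/train.py.
import torch
import torch.nn.functional as F

def generator_step(G, D, real, src_label, trg_label, lambda_cls=1.0, lambda_cyc=10.0):
    fake = G(real, trg_label)                      # translate to the target style
    adv_logits, dom_logits = D(fake)
    loss_adv = -adv_logits.mean()                  # fool the discriminator
    loss_cls = F.cross_entropy(dom_logits, trg_label.argmax(dim=1))  # land in the target domain
    recon = G(fake, src_label)                     # translate back to the source style
    loss_cyc = F.l1_loss(recon, real)              # cycle-consistency
    return loss_adv + lambda_cls * loss_cls + lambda_cyc * loss_cyc
```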
## Inference
Once the model is trained, convert classical singing to the "Murga" style using:
```bash
python src/inference.py --input "path_to_input_classical_audio.wav" --output "path_to_output_murga_audio.wav"
```
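Under the hood, inference amounts to loading the trained generator, turning the input audio into the model's feature representation, translating it to the murga domain, and synthesizing a waveform. A heavily simplified sketch, assuming mel-spectrogram features and Griffin-Lim reconstruction (both assumptions; `src/inference.py` may work differently):
```python
# Simplified inference sketch. Assumes librosa, torch, a loaded generator and
# mel-spectrogram features with Griffin-Lim reconstruction; src/inference.py may differ.
import librosa
import numpy as np
import soundfile as sf
import torch

def convert_to_murga(generator, in_path, out_path, sr=22050, n_mels=80):
    wav, _ = librosa.load(in_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_mels=n_mels)
    x = torch.from_numpy(np.log1p(mel)).float()[None, None]        # (1, 1, n_mels, frames)
    trg = torch.tensor([[0.0, 1.0]])                                # one-hot: murga domain
    with torch.no_grad():
        y = generator(x, trg).squeeze().numpy()
    mel_out = np.expm1(y)
    wav_out = librosa.feature.inverse.mel_to_audio(mel_out, sr=sr)  # Griffin-Lim
    sf.write(out_path, wav_out, sr)
```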
## Tests
To run unit tests:
```bash
python -m unittest discover tests
```

## Acknowledgments
- [Choral Singing Dataset](https://zenodo.org/records/2649950)
- [MUSDB18](https://sigsep.github.io/datasets/musdb.html#musdb18-compressed-stems)
- [StarGAN](https://arxiv.org/abs/1711.09020)