https://github.com/jackaduma/cyclegan-vc2

Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2




Host: GitHub
URL: https://github.com/jackaduma/cyclegan-vc2
Owner: jackaduma
License: mit
Created: 2020-05-13T11:29:35.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2023-06-10T12:00:48.000Z (over 2 years ago)
Last Synced: 2025-03-28T23:06:35.328Z (7 months ago)
Topics: aigc, cyclegan, cyclegan-vc, cyclegan-vc2, deep-learning, deeplearning, gan, pix2pix, pytorch-implementation, speech-synthesis, voice-cloning, voice-conversion
Language: Python
Homepage:
Size: 85 MB
Stars: 555
Watchers: 11
Forks: 108
Open Issues: 16
Metadata Files:
- Readme: README.md
- License: LICENSE


# **CycleGAN-VC2-PyTorch**

[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg?style=flat-square)](https://github.com/jackaduma/CycleGAN-VC2)

[![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://paypal.me/jackaduma?locale.x=zh_XC)

[**中文说明**](./README.zh-CN.md) | [**English**](./README.md)

------

This code is a **PyTorch** implementation of the paper [CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion](https://arxiv.org/abs/1904.04631), a nice piece of work on **Voice Conversion/Voice Cloning**.

- [x] Dataset

  - [ ] VC

  - [x] Chinese Male Speakers (S0913 from [AISHELL-Speech](https://openslr.org/33/) & [GaoXiaoSong: a Chinese star](https://en.wikipedia.org/wiki/Gao_Xiaosong))

- [x] Usage

  - [x] Training

  - [x] Example 

- [ ] Demo

- [x] Reference

------

## **Update**

**2020.11.17**: fixed issues: re-implemented the second-step adversarial loss.

**2020.08.27**: added the second-step adversarial loss, contributed by [Jeffery-zhang-nfls](https://github.com/Jeffery-zhang-nfls)

## **CycleGAN-VC2**

### [**Project Page**](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/index.html)

To advance the research on non-parallel VC, we propose CycleGAN-VC2, which is an improved version of CycleGAN-VC incorporating three new techniques: an improved objective (two-step adversarial losses), improved generator (2-1-2D CNN), and improved discriminator (PatchGAN).

![network](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/images/network.png "network")
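For reference, the second-step term of the "two-step adversarial losses" mentioned above can be sketched as below. The notation is an assumption recalled from the paper rather than something stated in this README, so check it against the paper before relying on it. Here $G_{X \rightarrow Y}$ and $G_{Y \rightarrow X}$ are the two generators, and $D'_X$ is an additional discriminator applied to cyclically converted features:

```math
\mathcal{L}_{adv2}(G_{X \rightarrow Y}, G_{Y \rightarrow X}, D'_X)
= \mathbb{E}_{x \sim P_X}\left[\log D'_X(x)\right]
+ \mathbb{E}_{x \sim P_X}\left[\log\left(1 - D'_X\left(G_{Y \rightarrow X}(G_{X \rightarrow Y}(x))\right)\right)\right]
```

The first-step adversarial loss judges the one-way output $G_{X \rightarrow Y}(x)$. This second term instead judges the reconstruction $G_{Y \rightarrow X}(G_{X \rightarrow Y}(x))$. A symmetric term is assumed for the $Y$ side.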

------

**This repository contains:** 

1. [model code](model_tf.py) that implements the paper.

2. [audio preprocessing script](preprocess_training.py) that you can use to create a cache for the [training data](data).

3. [training scripts](train.py) to train the model.

4. [Examples of Voice Conversion](converted_sound/) - converted result after training.

------

## **Table of Contents**

- [**CycleGAN-VC2-PyTorch**](#cyclegan-vc2-pytorch)

  - [**Update**](#update)

  - [**CycleGAN-VC2**](#cyclegan-vc2)

    - [**Project Page**](#project-page)

  - [**Table of Contents**](#table-of-contents)

  - [**Requirement**](#requirement)

  - [**Usage**](#usage)

    - [**preprocess**](#preprocess)

    - [**train**](#train)

  - [**Pretrained**](#pretrained)

  - [**Demo**](#demo)

  - [**Star-History**](#star-history)

  - [**Reference**](#reference)

  - [Donation](#donation)

  - [**License**](#license)

  

------

## **Requirement** 

```bash
pip install -r requirements.txt
```

## **Usage**

### **preprocess**

```bash
python preprocess_training.py
```

This is shorthand for the following command, with the default arguments written out:

```bash
python preprocess_training.py --train_A_dir ./data/S0913/ --train_B_dir ./data/gaoxiaosong/ --cache_folder ./cache/
```

### **train** 

```bash
python train.py
```

This is shorthand for the following command, with the default arguments written out:

```bash
python train.py --logf0s_normalization ./cache/logf0s_normalization.npz --mcep_normalization ./cache/mcep_normalization.npz --coded_sps_A_norm ./cache/coded_sps_A_norm.pickle --coded_sps_B_norm ./cache/coded_sps_B_norm.pickle --model_checkpoint ./model_checkpoint/ --resume_training_at ./model_checkpoint/_CycleGAN_CheckPoint --validation_A_dir ./data/S0913/ --output_A_dir ./converted_sound/S0913 --validation_B_dir ./data/gaoxiaosong/ --output_B_dir ./converted_sound/gaoxiaosong/
```
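Training reads four cache files that the preprocessing step is expected to produce, so it can help to confirm they exist before starting a long run. The sketch below is not part of the repository. The helper name `missing_cache_files` is hypothetical, and the file names are simply the ones that appear in the `train.py` arguments above.

```python
from pathlib import Path

# File names copied from the train.py arguments shown above.
EXPECTED_CACHE_FILES = (
    "logf0s_normalization.npz",
    "mcep_normalization.npz",
    "coded_sps_A_norm.pickle",
    "coded_sps_B_norm.pickle",
)


def missing_cache_files(cache_folder="./cache/"):
    """Return the expected cache files that are absent from cache_folder.

    Hypothetical helper for illustration; it only checks that the files
    exist and does not validate their contents.
    """
    folder = Path(cache_folder)
    return [name for name in EXPECTED_CACHE_FILES if not (folder / name).is_file()]
```

Calling `missing_cache_files("./cache/")` returns an empty list when all four files are present. Otherwise it returns the names that are still missing, which suggests that `preprocess_training.py` should be run first.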

------

## **Pretrained**

A pretrained model that converts between S0913 and GaoXiaoSong is available.

Download it from [Google Drive](https://drive.google.com/file/d/1iamizL98NWIPw4pw0nF-7b6eoBJrxEfj/view?usp=sharing) (735 MB).

------

## **Demo**

Samples:

**reference speaker A:** [S0913(./data/S0913/BAC009S0913W0351.wav)](https://drive.google.com/file/d/14zU1mI8QtoBwb8cHkNdZiPmXI6Mj6pVW/view?usp=sharing)

**reference speaker B:** [GaoXiaoSong(./data/gaoxiaosong/gaoxiaosong_1.wav)](https://drive.google.com/file/d/1s0ip6JwnWmYoWFcEQBwVIIdHJSqPThR3/view?usp=sharing)

**speaker A's speech changes to speaker B's voice:** [Converted from S0913 to GaoXiaoSong (./converted_sound/S0913/BAC009S0913W0351.wav)](https://drive.google.com/file/d/1S4vSNGM-T0RTo_aclxRgIPkUJ7NEqmjU/view?usp=sharing)
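When comparing a source file with its converted counterpart, a quick look at basic properties (sample rate, channel count, duration) can catch obvious problems before listening. The sketch below uses only Python's standard `wave` module and assumes the files are uncompressed PCM WAV. The helper name `wav_summary` is hypothetical and is not part of this repository.

```python
import wave


def wav_summary(path):
    """Return sample rate, channel count, and duration of a PCM WAV file.

    Hypothetical helper for illustration; the standard wave module
    does not read compressed WAV formats.
    """
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        n_frames = wav_file.getnframes()
        return {
            "sample_rate": sample_rate,
            "channels": wav_file.getnchannels(),
            "duration_s": n_frames / sample_rate,
        }
```

For example, the results of `wav_summary("./data/S0913/BAC009S0913W0351.wav")` and `wav_summary("./converted_sound/S0913/BAC009S0913W0351.wav")` could be compared side by side, assuming both files exist locally at those paths.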

------

## **Star-History**

![star-history](https://api.star-history.com/svg?repos=jackaduma/CycleGAN-VC2&type=Date "star-history")

------

## **Reference**

1. **CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion**. [Paper](https://arxiv.org/abs/1904.04631), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/index.html)

2. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. [Paper](https://arxiv.org/abs/1711.11293), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc/)

3. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [Paper](https://arxiv.org/abs/1703.10593), [Project](https://junyanz.github.io/CycleGAN/), [Code](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)

4. Image-to-Image Translation with Conditional Adversarial Nets. [Paper](https://arxiv.org/abs/1611.07004), [Project](https://phillipi.github.io/pix2pix/), [Code](https://github.com/phillipi/pix2pix)

------

## Donation

If this project helps you reduce development time, you can buy me a cup of coffee :)

AliPay (支付宝)

WechatPay (微信)

[![paypal](https://www.paypalobjects.com/en_US/i/btn/btn_donateCC_LG.gif)](https://paypal.me/jackaduma?locale.x=zh_XC)

------

## **License**

[MIT](LICENSE) © Kun
