{"id":19589695,"url":"https://github.com/jackaduma/cyclegan-vc3","last_synced_at":"2025-04-27T12:32:49.110Z","repository":{"id":37751110,"uuid":"307754775","full_name":"jackaduma/CycleGAN-VC3","owner":"jackaduma","description":"Voice Conversion by CycleGAN (语音克隆/语音转换)：CycleGAN-VC3","archived":false,"fork":false,"pushed_at":"2022-05-05T02:52:52.000Z","size":397,"stargazers_count":128,"open_issues_count":4,"forks_count":24,"subscribers_count":10,"default_branch":"main","last_synced_at":"2023-11-07T18:24:46.114Z","etag":null,"topics":["aigc","cyclegan","cyclegan-vc","cyclegan-vc2","cyclegan-vc3","gan","pytorch","pytorch-implementation","voice-cloning","voice-conversion"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jackaduma.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-27T16:03:09.000Z","updated_at":"2023-10-29T13:20:35.000Z","dependencies_parsed_at":"2022-09-01T04:12:20.070Z","dependency_job_id":null,"html_url":"https://github.com/jackaduma/CycleGAN-VC3","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackaduma%2FCycleGAN-VC3","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackaduma%2FCycleGAN-VC3/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackaduma%2FCycleGAN-VC3/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jackaduma%2FCycleGAN-VC3/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jackaduma","download
_url":"https://codeload.github.com/jackaduma/CycleGAN-VC3/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224070383,"owners_count":17250651,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aigc","cyclegan","cyclegan-vc","cyclegan-vc2","cyclegan-vc3","gan","pytorch","pytorch-implementation","voice-cloning","voice-conversion"],"created_at":"2024-11-11T08:20:18.140Z","updated_at":"2024-11-11T08:20:19.030Z","avatar_url":"https://github.com/jackaduma.png","language":"Python","funding_links":["https://paypal.me/jackaduma?locale.x=zh_XC"],"categories":[],"sub_categories":[],"readme":"# **CycleGAN-VC3-PyTorch**\n\n[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg?style=flat-square)](https://github.com/jackaduma/CycleGAN-VC2)\n[![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://paypal.me/jackaduma?locale.x=zh_XC)\n\n[**中文说明**](./README.zh-CN.md) | [**English**](./README.md)\n\n------\n\nThis code is a **PyTorch** implementation of the paper [CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion](https://arxiv.org/abs/2010.11672), a nice work on **Voice Conversion/Voice Cloning**.\n\n- [x] Dataset\n  - [ ] VC\n- [x] Usage\n  - [x] Training\n  - [x] Example \n- [ ] Demo\n- [x] Reference\n\n------\n\n## **CycleGAN-VC3**\n\n### [**Project Page**](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc3/index.html) \n\n\nNon-parallel voice conversion (VC) is a technique for learning mappings between source and 
target speeches without using a parallel corpus. Recently, CycleGAN-VC [3] and CycleGAN-VC2 [2] have shown promising results regarding this problem and have been widely used as benchmark methods. However, owing to the ambiguity of the effectiveness of CycleGAN-VC/VC2 for **mel-spectrogram conversion**, they are typically used for mel-cepstrum conversion even when comparative methods employ mel-spectrogram as a conversion target. To address this, we examined the applicability of CycleGAN-VC/VC2 to **mel-spectrogram conversion**. Through initial experiments, we discovered that their direct applications compromised the time-frequency structure that should be preserved during conversion. To remedy this, we propose CycleGAN-VC3, an improvement of CycleGAN-VC2 that incorporates **time-frequency adaptive normalization (TFAN)**. Using TFAN, we can adjust the scale and bias of the converted features while reflecting the time-frequency structure of the source mel-spectrogram. We evaluated CycleGAN-VC3 on inter-gender and intra-gender non-parallel VC. A subjective evaluation of naturalness and similarity showed that for every VC pair, CycleGAN-VC3 outperforms or is competitive with the two types of CycleGAN-VC2, one of which was applied to mel-cepstrum and the other to mel-spectrogram.\n\n![network comparison](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc3/images/comparison.png \"comparison between vc2 and vc3\")  _Figure 1. We developed time-frequency adaptive normalization (TFAN), which extends instance normalization [5] so that the affine parameters become element-dependent and are determined according to an entire input mel-spectrogram._\n\n------\n\n**This repository contains:** \n\n1. [TFAN module code](tfan_module.py), which implements the TFAN module.\n2. [model code](model.py), which implements the model network.\n3. [audio preprocessing script](preprocess_training.py), which you can use to create a cache for the [training data](data).\n4. 
[training scripts](train.py) to train the model.\n\n\n\n------\n\n## **Table of Contents**\n\n- [**CycleGAN-VC3-PyTorch**](#cyclegan-vc3-pytorch)\n  - [**CycleGAN-VC3**](#cyclegan-vc3)\n    - [**Project Page**](#project-page)\n  - [**Table of Contents**](#table-of-contents)\n  - [**Requirement**](#requirement)\n  - [**Usage**](#usage)\n  - [**Star-History**](#star-history)\n  - [**Reference**](#reference)\n  - [Donation](#donation)\n  - [**License**](#license)\n  \n------\n\n## **Requirement** \n\n```bash\npip install -r requirements.txt\n```\n## **Usage**\n\n\n------\n\n## **Star-History**\n\n![star-history](https://api.star-history.com/svg?repos=jackaduma/CycleGAN-VC3\u0026type=Date \"star-history\")\n\n------\n\n## **Reference**\n1. **CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion.** [Paper](https://arxiv.org/abs/2010.11672), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc3/index.html)\n2. CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion. [Paper](https://arxiv.org/abs/1904.04631), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc2/index.html)\n3. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. [Paper](https://arxiv.org/abs/1711.11293), [Project](http://www.kecl.ntt.co.jp/people/kaneko.takuhiro/projects/cyclegan-vc/)\n4. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [Paper](https://arxiv.org/abs/1703.10593), [Project](https://junyanz.github.io/CycleGAN/), [Code](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)\n5. Image-to-Image Translation with Conditional Adversarial Nets. 
[Paper](https://arxiv.org/abs/1611.07004), [Project](https://phillipi.github.io/pix2pix/), [Code](https://github.com/phillipi/pix2pix)\n\n\n------\n\n## Donation\nIf this project helps you save development time, you can buy me a cup of coffee :) \n\nAliPay(支付宝)\n\u003cdiv align=\"center\"\u003e\n\t\u003cimg src=\"./misc/ali_pay.png\" alt=\"ali_pay\" width=\"400\" /\u003e\n\u003c/div\u003e\n\nWechatPay(微信)\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./misc/wechat_pay.png\" alt=\"wechat_pay\" width=\"400\" /\u003e\n\u003c/div\u003e\n\n[![paypal](https://www.paypalobjects.com/en_US/i/btn/btn_donateCC_LG.gif)](https://paypal.me/jackaduma?locale.x=zh_XC)\n\n\n------\n\n## **License**\n\n[MIT](LICENSE) © Kun","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjackaduma%2Fcyclegan-vc3","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjackaduma%2Fcyclegan-vc3","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjackaduma%2Fcyclegan-vc3/lists"}