{"id":20989405,"url":"https://github.com/haoheliu/voicefixer_main","last_synced_at":"2025-04-06T12:09:39.411Z","repository":{"id":45882488,"uuid":"410468305","full_name":"haoheliu/voicefixer_main","owner":"haoheliu","description":"General Speech Restoration","archived":false,"fork":false,"pushed_at":"2024-01-13T00:38:35.000Z","size":22554,"stargazers_count":276,"open_issues_count":11,"forks_count":56,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-03-30T10:09:30.800Z","etag":null,"topics":["machine-learning","speech","speech-analysis","speech-enhancement","speech-processing","speech-synthesis","speech-to-text","tts"],"latest_commit_sha":null,"homepage":"https://haoheliu.github.io/demopage-voicefixer/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/haoheliu.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-26T06:20:37.000Z","updated_at":"2025-03-12T18:15:29.000Z","dependencies_parsed_at":"2024-11-19T06:40:38.658Z","dependency_job_id":null,"html_url":"https://github.com/haoheliu/voicefixer_main","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haoheliu%2Fvoicefixer_main","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haoheliu%2Fvoicefixer_main/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haoheliu%2Fvoicefixer_main/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haoh
eliu%2Fvoicefixer_main/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/haoheliu","download_url":"https://codeload.github.com/haoheliu/voicefixer_main/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247478323,"owners_count":20945266,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","speech","speech-analysis","speech-enhancement","speech-processing","speech-synthesis","speech-to-text","tts"],"created_at":"2024-11-19T06:24:45.460Z","updated_at":"2025-04-06T12:09:39.380Z","avatar_url":"https://github.com/haoheliu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![arXiv](https://img.shields.io/badge/arXiv-2109.13731-brightgreen.svg?style=flat-square)](https://arxiv.org/abs/2109.13731) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1HYYUepIsl2aXsdET6P_AmNVXuWP1MCMf?usp=sharing) [![PyPI version](https://badge.fury.io/py/voicefixer.svg)](https://badge.fury.io/py/voicefixer) [![githubio](https://img.shields.io/badge/GitHub.io-Audio_Samples-blue?logo=Github\u0026style=flat-square)](https://haoheliu.github.io/demopage-voicefixer)\n\n**2021-11-06: I have just updated the code structure to make it easier to understand. It may contain bugs for now. I will run some test training later.**\n\n~~**2021-11-01: I will update the code and make it easier to use later.**~~\n\n# VoiceFixer\n\nVoiceFixer is a framework for general speech restoration. 
We aim to restore severely degraded speech and historical speech.\n\n- [VoiceFixer](#voicefixer)\n  * [Materials](#materials)\n  * [Usage](#usage)\n    + [Environment (Do this at first)](#environment--do-this-at-first-)\n    + [VoiceFixer for general speech restoration](#voicefixer-for-general-speech-restoration)\n    + [ResUNet for general speech restoration](#resunet-for-general-speech-restoration)\n    + [ResUNet for single task speech restoration](#resunet-for-single-task-speech-restoration)\n  * [Citation](#citation)\n  \n## Materials\n\n- *arXiv* preprint: https://arxiv.org/abs/2109.13731 \n- The [demo page](https://haoheliu.github.io/demopage-voicefixer/) compares single-task speech restoration, general speech restoration, and VoiceFixer.\n- We wrote a [pip package](https://pypi.org/project/voicefixer) for VoiceFixer.\n- The datasets used in this repo: [training and testing datasets](https://zenodo.org/record/5546723#.YYaWE05BxaQ)\n\n## Usage\n\n### Environment (Do this at first)\n```shell script\n# Download dataset and prepare running environment\ngit clone https://github.com/haoheliu/voicefixer_main.git\ncd voicefixer_main\nsource init.sh \n```\n\n### VoiceFixer for general speech restoration\n**Here we take *VF_UNet* (VoiceFixer with a UNet as the analysis module) as an example.**\n\n- Training\n```shell\n# pass in a configuration file to the training script\npython3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to customize your training\n```\nYou can check the *logs* directory for checkpoints, logs, and validation results.\n\n- Evaluation\n\nEvaluation is automatic and generates a .csv file for all testsets.\n\nFor example, to evaluate on all testsets (the default): 
\n```shell script\npython3 eval_gsr_voicefixer.py  \\\n                    --config  \u003cpath-to-the-config-file\u003e \\\n                    --ckpt  \u003cpath-to-the-checkpoint\u003e \n```\n\nFor example, to evaluate only on the GSR testset: \n```shell script\npython3 eval_gsr_voicefixer.py  \\\n                    --config  \u003cpath-to-the-config-file\u003e \\\n                    --ckpt  \u003cpath-to-the-checkpoint\u003e \\\n                    --testset  general_speech_restoration \\\n                    --description  general_speech_restoration_eval \n```\n\nYou can pass the following testsets to **--testset**: \n- **base**: all testsets\n- **clip**: testset with clipped speech at clipping thresholds of 0.1, 0.25, and 0.5\n- **reverb**: testset with reverberant speech\n- **general_speech_restoration**: testset with speech that contains all kinds of random distortions\n- **enhancement**: testset with noisy speech\n- **speech_super_resolution**: testset with low-resolution speech at sampling rates of 2 kHz, 4 kHz, 8 kHz, 16 kHz, and 24 kHz\n\nYou may also want to evaluate on only a small portion of the data, e.g. 10 utterances. 
You can pass the number to the **--limit_numbers** argument.\n\n```shell script\npython3 eval_gsr_voicefixer.py  \\\n                    --config  \u003cpath-to-the-config-file\u003e \\\n                    --ckpt  \u003cpath-to-the-checkpoint\u003e \\\n                    --limit_numbers 10 \n```\n\nEvaluation results can be found in the *exp_results* folder.\n\n### ResUNet for general speech restoration\n\n- Training \n```shell\n# pass in a configuration file to the training script\npython3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json\n```\nYou can check the *logs* directory for checkpoints, logs, and validation results.\n\n- Evaluation (similar to voicefixer evaluation)\n    ```shell script\n    python3 eval_ssr_unet.py  \\\n                        --config  \u003cpath-to-the-config-file\u003e \\\n                        --ckpt  \u003cpath-to-the-checkpoint\u003e \\\n                        --limit_numbers \u003cint-test-only-on-a-few-utterance\u003e \\\n                        --testset  \u003cthe-testset-you-want-to-use\u003e \\\n                        --description  \u003cdescribe-this-test\u003e\n    ```\n\n### ResUNet for single task speech restoration\n\n- Training \n  - Denoising\n  ```shell\n  # pass in a configuration file to the training script\n  python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json\n  ```\n\n  - Dereverberation\n  ```shell\n  # pass in a configuration file to the training script\n  python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json\n  ```\n  \n  - Super Resolution\n  ```shell\n  # pass in a configuration file to the training script\n  python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json\n  ```\n  \n  - Declipping\n  ```shell\n  # pass in a configuration file to the training script\n  python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json\n  ```\n\nYou can check the *logs* directory for checkpoints, logs, and validation 
results.\n\n  - Evaluation (similar to voicefixer evaluation)\n    ```shell script\n    python3 eval_ssr_unet.py  \\\n                        --config  \u003cpath-to-the-config-file\u003e \\\n                        --ckpt  \u003cpath-to-the-checkpoint\u003e \\\n                        --limit_numbers \u003cint-test-only-on-a-few-utterance\u003e \\\n                        --testset  \u003cthe-testset-you-want-to-use\u003e \\\n                        --description  \u003cdescribe-this-test\u003e\n    ```\n\n## Citation\n\n```bib\n @misc{liu2021voicefixer,   \n     title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},   \n     author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},  \n     year={2021},  \n     eprint={2109.13731},  \n     archivePrefix={arXiv},  \n     primaryClass={cs.SD}  \n }\n```\n\n![real-life-example](resources/pics/real.png)\n![real-life-example](resources/pics/gsr-demo.png)\n![real-life-example](resources/pics/SR-2k.png)\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhaoheliu%2Fvoicefixer_main","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhaoheliu%2Fvoicefixer_main","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhaoheliu%2Fvoicefixer_main/lists"}