{"id":18430375,"url":"https://github.com/aryanvbw/aivoiceclonerpro","last_synced_at":"2025-04-07T17:33:34.236Z","repository":{"id":197419345,"uuid":"698615734","full_name":"AryanVBW/AiVoiceClonerPRO","owner":"AryanVBW","description":"Revolutionize Your Voice with AI Voice Cloner! Transform Your Speech into Your Favorite Celebrity's or Your Customized Voice. Our Cutting-edge Tool Converts Text or Any Audio into Your Desired Voice – Your Voice, Your Way","archived":false,"fork":false,"pushed_at":"2025-03-21T03:28:06.000Z","size":13711,"stargazers_count":49,"open_issues_count":12,"forks_count":7,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-22T22:04:02.629Z","etag":null,"topics":["ai","artificial-intelligence","artificial-neural-networks","aryanvbw","clone","clonevoice","pytorch","voice-clone","voice-cloneai","voiceclone"],"latest_commit_sha":null,"homepage":"https://aryanvbw.github.io/AiVoiceClonerPRO/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AryanVBW.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-30T12:55:35.000Z","updated_at":"2025-03-10T10:48:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"7f27927e-a349-4357-9509-a641951d0c2d","html_url":"https://github.com/AryanVBW/AiVoiceClonerPRO","commit_stats":null,"previous_names":["aryanvbw/aivoiceclonerpro"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AryanVBW%2FAiVoiceClonerPRO","tags_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/AryanVBW%2FAiVoiceClonerPRO/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AryanVBW%2FAiVoiceClonerPRO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AryanVBW%2FAiVoiceClonerPRO/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AryanVBW","download_url":"https://codeload.github.com/AryanVBW/AiVoiceClonerPRO/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247698108,"owners_count":20981299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","artificial-intelligence","artificial-neural-networks","aryanvbw","clone","clonevoice","pytorch","voice-clone","voice-cloneai","voiceclone"],"created_at":"2024-11-06T05:20:29.932Z","updated_at":"2025-04-07T17:33:33.119Z","avatar_url":"https://github.com/AryanVBW.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n\u003ch1\u003eAiVoiceClonerPRO\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://github.com/AryanVBW/kali-Linux-Android/releases/download/1/removebackground.png\" height=\"150\"\u003e\u003cbr\u003e\nAn easy-to-use Voice Conversion framework based on VITS, powered by python \n\u003c/p\u003e\n\n\n  \n[![Open In 
Colab](https://img.shields.io/badge/Colab-F9AB00?style=for-the-badge\u0026logo=googlecolab\u0026color=525252)](https://colab.research.google.com/github/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/Retrieval_based_Voice_Conversion_WebUI.ipynb)\n[![Licence](https://img.shields.io/github/license/RVC-Project/Retrieval-based-Voice-Conversion-WebUI?style=for-the-badge)](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/LICENSE)\n[![Huggingface](https://img.shields.io/badge/🤗%20-Spaces-yellow.svg?style=for-the-badge)](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/)\n\n[![Discord](https://img.shields.io/badge/RVC%20Developers-Discord-7289DA?style=for-the-badge\u0026logo=discord\u0026logoColor=white)](https://discord.gg/HcsmBBGyVk)\n\n\u003c/div\u003e\n\n------\n[**Changelog**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/Changelog_EN.md) | [**FAQ (Frequently Asked Questions)**](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/FAQ-(Frequently-Asked-Questions)) \n\nRealtime Voice Conversion Software using RVC: [w-okada/voice-changer](https://github.com/w-okada/voice-changer)\n\n\n\u003e The pre-trained base model is trained on nearly 50 hours of the high-quality, open-source VCTK dataset.\n\n\u003e High-quality licensed song datasets will be added to the training set over time, so you can use them without worrying about copyright infringement.\n\n\u003e Please look forward to the pretrained base model of RVCv3, which has more parameters, more training data, better results, and unchanged inference speed, and requires less training data.\n\n## Summary\nThis repository has the following features:\n+ Reduce tone leakage by replacing the source feature with the training-set feature using top-1 retrieval;\n+ Easy and fast training, even on relatively poor graphics cards;\n+ Training with a small amount of data also obtains relatively good results 
(\u003e=10 min of low-noise speech recommended);\n+ Supporting model fusion to change timbres (using ckpt processing tab-\u003eckpt merge);\n+ Easy-to-use WebUI;\n+ Use the UVR5 model to quickly separate vocals and instruments;\n+ Use the high-pitch voice extraction algorithm [InterSpeech2023-RMVPE](#Credits) to prevent the muted sound problem. It gives the best results by a significant margin, and is faster than Crepe_full with even lower resource consumption;\n+ AMD/Intel graphics card acceleration supported;\n+ Intel ARC graphics card acceleration with IPEX supported.\n\n## Preparing the environment\nThe following commands need to be executed in an environment with Python 3.8 or higher.\n\n(Windows/Linux)\nFirst install the main dependencies through pip:\n```bash\n# Install PyTorch-related core dependencies, skip if installed\n# Reference: https://pytorch.org/get-started/locally/\npip install torch torchvision torchaudio\n\n# For Windows + Nvidia Ampere architecture (RTX30xx), specify the CUDA version that matches your PyTorch build, as described in https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/21\n#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117\n\n# For Linux + AMD cards, use the following PyTorch build:\n#pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2\n```\n\nThen you can use Poetry to install the other dependencies:\n```bash\n# Install the Poetry dependency management tool, skip if installed\n# Reference: https://python-poetry.org/docs/#installation\ncurl -sSL https://install.python-poetry.org | python3 -\n\n# Install the project dependencies\npoetry install\n```\n\nYou can also use pip to install them:\n```bash\n\nfor Nvidia graphics cards\n  pip install -r requirements.txt\n\nfor AMD/Intel graphics cards on Windows (DirectML):\n  pip install -r requirements-dml.txt\n\nfor Intel ARC 
graphics cards on Linux / WSL using Python 3.10: \n  pip install -r requirements-ipex.txt\n\nfor AMD graphics cards on Linux (ROCm):\n  pip install -r requirements-amd.txt\n```\n\n------\nMac users can install dependencies via `run.sh`:\n```bash\nsh ./run.sh\n```\n\n## Preparation of other Pre-models\nRVC requires other pre-models for inference and training.\n\nYou need to download them from our [Huggingface space](https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main/).\n\nHere's a list of pre-models and other files that RVC needs:\n```bash\n./assets/hubert/hubert_base.pt\n\n./assets/pretrained\n\n./assets/uvr5_weights\n\nIf you want to test the v2 version model (the v2 version model has changed the input from the 256 dimensional feature of 9-layer Hubert+final_proj to the 768 dimensional feature of 12-layer Hubert, and has added 3 period discriminators), you will need an additional download:\n\n./assets/pretrained_v2\n\n# If you are using Windows, you may also need these two files; skip this if FFmpeg and FFprobe are installed\nffmpeg.exe\n\nhttps://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffmpeg.exe\n\nffprobe.exe\n\nhttps://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/ffprobe.exe\n\nIf you want to use the latest SOTA RMVPE vocal pitch extraction algorithm, you need to download the RMVPE weights and place them in the RVC root directory\n\nhttps://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.pt\n\n    AMD/Intel graphics card users need to download:\n\n    https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/rmvpe.onnx\n\n```\n\nIntel ARC graphics card users need to run `source /opt/intel/oneapi/setvars.sh` before starting the WebUI.\n\nThen use this command to start the WebUI:\n```bash\npython infer-web.py\n```\n\nIf you are using Windows or macOS, you can download and extract `RVC-beta.7z` to use RVC 
directly by using `go-web.bat` on Windows or `sh ./run.sh` on macOS to start the WebUI.\n\n## ROCm Support for AMD graphics cards (Linux only)\nTo use ROCm on Linux, install all required drivers as described [here](https://rocm.docs.amd.com/en/latest/deploy/linux/os-native/install.html).\n\nOn Arch, use pacman to install the driver:\n````\npacman -S rocm-hip-sdk rocm-opencl-sdk\n````\n\nYou might also need to set these environment variables (e.g. on an RX6700XT):\n````\nexport ROCM_PATH=/opt/rocm\nexport HSA_OVERRIDE_GFX_VERSION=10.3.0\n````\nAlso make sure your user is part of the `render` and `video` groups:\n````\nsudo usermod -aG render $USER\nsudo usermod -aG video $USER\n````\nAfter that you can run the WebUI:\n```bash\npython infer-web.py\n```\n\n## Credits\n+ [RVC-Projects](https://github.com/RVC-Project)\n+ [ContentVec](https://github.com/auspicious3000/contentvec/)\n+ [VITS](https://github.com/jaywalnut310/vits)\n+ [HIFIGAN](https://github.com/jik876/hifi-gan)\n+ [Gradio](https://github.com/gradio-app/gradio)\n+ [FFmpeg](https://github.com/FFmpeg/FFmpeg)\n+ [Ultimate Vocal Remover](https://github.com/Anjok07/ultimatevocalremovergui)\n+ [audio-slicer](https://github.com/openvpi/audio-slicer)\n+ [Vocal pitch extraction: RMVPE](https://github.com/Dream-High/RMVPE)\n  + The pretrained model was trained and tested by [yxlllc](https://github.com/yxlllc/RMVPE) and [RVC-Boss](https://github.com/RVC-Boss).\n  \n## Thanks to all contributors for their efforts\n\u003ca href=\"https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/graphs/contributors\" target=\"_blank\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=RVC-Project/Retrieval-based-Voice-Conversion-WebUI\" 
/\u003e\n\u003c/a\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faryanvbw%2Faivoiceclonerpro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faryanvbw%2Faivoiceclonerpro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faryanvbw%2Faivoiceclonerpro/lists"}