{"id":19684154,"url":"https://github.com/PlayVoice/VI-Speaker","last_synced_at":"2025-04-29T05:32:10.022Z","repository":{"id":120308564,"uuid":"537271602","full_name":"PlayVoice/VI-Speaker","owner":"PlayVoice","description":"Speaker embedding for VI-SVC and VI-SVS, alse for VITS; Use this to replace the ID to implement voice clone.","archived":false,"fork":false,"pushed_at":"2022-09-16T15:00:05.000Z","size":64,"stargazers_count":29,"open_issues_count":2,"forks_count":3,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-04-05T13:38:10.336Z","etag":null,"topics":["speaker-embedding","speaker-identification","vits","voice-clone"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PlayVoice.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-09-16T02:02:32.000Z","updated_at":"2024-12-16T02:20:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"5c7632c2-c9bd-4aef-871d-99c5f116d37c","html_url":"https://github.com/PlayVoice/VI-Speaker","commit_stats":null,"previous_names":["yuchendd/vi-speaker","maxmax2016/vi-speaker","playvoice/vi-speaker"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlayVoice%2FVI-Speaker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlayVoice%2FVI-Speaker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlayVoice%2FVI-Speaker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PlayVoice%2FVI-Speaker/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PlayVoice","download_url":"https://codeload.github.com/PlayVoice/VI-Speaker/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251444694,"owners_count":21590557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["speaker-embedding","speaker-identification","vits","voice-clone"],"created_at":"2024-11-11T18:16:59.324Z","updated_at":"2025-04-29T05:32:09.744Z","avatar_url":"https://github.com/PlayVoice.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VI-Speaker\nSpeaker embedding for VI-SVC and VI-SVS, and also for VITS. Use it in place of the speaker ID to implement voice cloning.\n\n# code from mozilla/TTS and coqui-ai/TTS\nhttps://github.com/mozilla/TTS/tree/master/TTS/speaker_encoder\n\nhttps://github.com/coqui-ai/TTS\n\npip install coqpit\n\n# download model\nhttps://github.com/mozilla/TTS/wiki/Released-Models\n\nSpeaker-Encoder by @mueller91\tLibriTTS + VCTK + VoxCeleb + CommonVoice\n\nhttps://drive.google.com/drive/folders/15oeBYf6Qn1edONkVLXe82MzdIi3O_9m3\n\nOr get **saved_models.zip** from the release.\n\n# use\npython vi_speaker_single.py ./saved_models/best_model.pth.tar ./saved_models/config.json -s TEST.wav -t TEST.npy\n\n# batch use\npython vi_speaker_batch.py ./saved_models/best_model.pth.tar ./saved_models/config.json ./data/waves ./speaker_embedding\n\n    data/\n    └── waves\n        ├── spk1\n        │   ├── 000002.wav\n        │   ├── 000006.wav\n        │   └── 000038.wav\n        └── spk2\n            ├── 000040.wav\n      
      ├── 000044.wav\n            └── 000077.wav\n\n    speaker_embedding/\n    ├── spk1\n    │   ├── 000002.npy\n    │   ├── 000006.npy\n    │   └── 000038.npy\n    └── spk2\n        ├── 000040.npy\n        ├── 000044.npy\n        └── 000077.npy\n\n# compute speaker center\ninput path = speaker_embedding, output path = speaker_embedding_center\n\npython vi_speaker_center.py\n\n    speaker_embedding_center/\n    ├── spk1.npy\n    └── spk2.npy\n\n\n# for VI-SVC\nmv speaker_embedding_center data/spkid\n\n    data/\n    ├── waves\n    │   ├── 10001\n    │   ├── 20400\n    │   │   ├── 20400_001.wav\n    │   │   ├── 20456_019.wav\n    │   │   \n    ├── phone\n    │   ├── 10001\n    │   ├── 20400\n    │   │   ├── 20400_001.npy\n    │   │   ├── 20456_019.npy\n    │   │   \n    ├── lable\n    │   ├── 10001\n    │   ├── 20400\n    │   │   ├── 20400_001.npy\n    │   │   ├── 20456_019.npy\n    │   │   \n    ├── spkid\n    │   ├── 10001.npy\n    │   ├── 20400.npy\n    │   │   \n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPlayVoice%2FVI-Speaker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPlayVoice%2FVI-Speaker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPlayVoice%2FVI-Speaker/lists"}