{"id":15887341,"url":"https://github.com/lipku/livetalking","last_synced_at":"2025-05-13T17:13:18.015Z","repository":{"id":214428510,"uuid":"733283939","full_name":"lipku/LiveTalking","owner":"lipku","description":"Real time interactive streaming digital human","archived":false,"fork":false,"pushed_at":"2025-05-01T13:02:57.000Z","size":46609,"stargazers_count":5478,"open_issues_count":306,"forks_count":810,"subscribers_count":57,"default_branch":"main","last_synced_at":"2025-05-06T21:03:35.452Z","etag":null,"topics":["aigc","digihuman","digital-human","er-nerf","lip-sync","metahuman-stream","musetalk","nerf","realtime","streaming","talking-head","virtualhumans","wav2lip"],"latest_commit_sha":null,"homepage":"https://livetalking-doc.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lipku.png","metadata":{"files":{"readme":"README-EN.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["lipku"]}},"created_at":"2023-12-19T01:32:46.000Z","updated_at":"2025-05-06T15:48:59.000Z","dependencies_parsed_at":"2024-03-09T03:20:02.568Z","dependency_job_id":"f08169e1-d63e-490a-aa43-24f027d7c3ea","html_url":"https://github.com/lipku/LiveTalking","commit_stats":null,"previous_names":["lipku/metahuman-stream","lipku/livetalking"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lipku%2FLiveTalking","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lipku%2FLiveTalking/tags","releases_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/lipku%2FLiveTalking/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lipku%2FLiveTalking/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lipku","download_url":"https://codeload.github.com/lipku/LiveTalking/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253990492,"owners_count":21995776,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aigc","digihuman","digital-human","er-nerf","lip-sync","metahuman-stream","musetalk","nerf","realtime","streaming","talking-head","virtualhumans","wav2lip"],"created_at":"2024-10-06T06:01:55.350Z","updated_at":"2025-05-13T17:13:12.985Z","avatar_url":"https://github.com/lipku.png","language":"Python","funding_links":["https://github.com/sponsors/lipku"],"categories":["Python"],"sub_categories":[],"readme":"Real-time interactive streaming digital human enables synchronous audio and video dialogue. It can basically achieve commercial effects.\n\n[Effect of wav2lip](https://www.bilibili.com/video/BV1scwBeyELA/) | [Effect of ernerf](https://www.bilibili.com/video/BV1G1421z73r/) |  [Effect of musetalk](https://www.bilibili.com/video/BV1gm421N7vQ/)  \n\n## News\n- December 8, 2024: Improved multi-concurrency, and the video memory does not increase with the number of concurrent connections.\n- December 21, 2024: Added model warm-up for wav2lip and musetalk to solve the problem of stuttering during the first inference. 
Thanks to [@heimaojinzhangyz](https://github.com/heimaojinzhangyz)\n- December 28, 2024: Added the digital human model Ultralight-Digital-Human. Thanks to [@lijihua2017](https://github.com/lijihua2017)\n- February 7, 2025: Added fish-speech tts\n- February 21, 2025: Added the open-source model wav2lip256. Thanks to @不蠢不蠢\n- March 2, 2025: Added Tencent's speech synthesis service\n- March 16, 2025: Supports mac gpu inference. Thanks to [@GcsSloop](https://github.com/GcsSloop) \n\n## Features\n1. Supports multiple digital human models: ernerf, musetalk, wav2lip, Ultralight-Digital-Human\n2. Supports voice cloning\n3. Supports interrupting the digital human while it is speaking\n4. Supports full-body video stitching\n5. Supports rtmp and webrtc\n6. Supports video arrangement: Play custom videos when not speaking\n7. Supports multi-concurrency\n\n## 1. Installation\n\nTested on Ubuntu 20.04, Python 3.10, Pytorch 1.12 and CUDA 11.3\n\n### 1.1 Install dependency\n\n```bash\nconda create -n nerfstream python=3.10\nconda activate nerfstream\n# If the cuda version is not 11.3 (confirm the version by running nvidia-smi), install the corresponding version of pytorch according to \u003chttps://pytorch.org/get-started/previous-versions/\u003e \nconda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch\npip install -r requirements.txt\n# If you need to train the ernerf model, install the following libraries\n# pip install \"git+https://github.com/facebookresearch/pytorch3d.git\"\n# pip install tensorflow-gpu==2.8.0\n# pip install --upgrade \"protobuf\u003c=3.20.1\"\n``` \nCommon installation issues [FAQ](https://livetalking-doc.readthedocs.io/en/latest/faq.html)  \nFor setting up the linux cuda environment, you can refer to this article https://zhuanlan.zhihu.com/p/674972886\n\n\n## 2. 
Quick Start\n- Download the models  \nQuark Cloud Disk \u003chttps://pan.quark.cn/s/83a750323ef0\u003e    \nGoogle Drive \u003chttps://drive.google.com/drive/folders/1FOC_MD6wdogyyX_7V1d4NDIO7P9NlSAJ?usp=sharing\u003e  \nCopy wav2lip256.pth to the models folder of this project and rename it to wav2lip.pth;  \nExtract wav2lip256_avatar1.tar.gz and copy the entire folder to the data/avatars folder of this project.\n- Run  \npython app.py --transport webrtc --model wav2lip --avatar_id wav2lip256_avatar1  \nOpen http://serverip:8010/webrtcapi.html in a browser. First click 'start' to play the digital human video; then enter any text in the text box and submit it. The digital human will speak this text.  \n\u003cfont color=red\u003eThe server side needs to open ports tcp:8010; udp:1-65535\u003c/font\u003e  \nA high-definition wav2lip model for commercial use can be purchased [here](https://livetalking-doc.readthedocs.io/zh-cn/latest/service.html#wav2lip).  \n\n- Quick experience  \n\u003chttps://www.compshare.cn/images-detail?ImageID=compshareImage-18tpjhhxoq3j\u0026referral_code=3XW3852OBmnD089hMMrtuU\u0026ytag=GPU_GitHub_livetalking1.3\u003e Create an instance with this image to run it.\n\nIf you can't access huggingface, set the following before running:\n```\nexport HF_ENDPOINT=https://hf-mirror.com\n``` \n\n\n## 3. More Usage\nUsage instructions: \u003chttps://livetalking-doc.readthedocs.io/en/latest\u003e\n  \n## 4. Docker Run  \nThe installation steps above are not needed; run the image directly.\n```\ndocker run --gpus all -it --network=host --rm registry.cn-beijing.aliyuncs.com/codewithgpu2/lipku-metahuman-stream:2K9qaMBu8v\n```\nThe code is in /root/metahuman-stream. First run git pull to get the latest code, then run the commands from sections 2 and 3. 
\n\nThe following images are provided:\n- autodl image: \u003chttps://www.codewithgpu.com/i/lipku/metahuman-stream/base\u003e   \n[autodl Tutorial](https://livetalking-doc.readthedocs.io/en/latest/autodl/README.html)\n- ucloud image: \u003chttps://www.compshare.cn/images-detail?ImageID=compshareImage-18tpjhhxoq3j\u0026referral_code=3XW3852OBmnD089hMMrtuU\u0026ytag=GPU_livetalking1.3\u003e  \nAny port can be opened, and there is no need to deploy an srs service additionally.  \n[ucloud Tutorial](https://livetalking-doc.readthedocs.io/en/latest/ucloud/ucloud.html) \n\n\n## 5. TODO\n- [x] Added chatgpt to enable digital human dialogue\n- [x] Voice cloning\n- [x] Replace the digital human with a video when it is silent\n- [x] MuseTalk\n- [x] Wav2Lip\n- [x] Ultralight-Digital-Human\n\n---\nIf this project is helpful to you, please give it a star. Friends who are interested are also welcome to join in and improve this project together.\n* Knowledge Planet: https://t.zsxq.com/7NMyO, where high-quality common problems, best practice experiences, and problem solutions are accumulated.\n* WeChat Official Account: Digital Human Technology  \n![](https://mmbiz.qpic.cn/sz_mmbiz_jpg/l3ZibgueFiaeyfaiaLZGuMGQXnhLWxibpJUS2gfs8Dje6JuMY8zu2tVyU9n8Zx1yaNncvKHBMibX0ocehoITy5qQEZg/640?wxfrom=12\u0026tp=wxpic\u0026usePicPrefetch=1\u0026wx_fmt=jpeg\u0026amp;from=appmsg) ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flipku%2Flivetalking","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flipku%2Flivetalking","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flipku%2Flivetalking/lists"}