{"id":13603184,"url":"https://github.com/yerfor/Real3DPortrait","last_synced_at":"2025-04-11T13:33:11.856Z","repository":{"id":220468292,"uuid":"751718568","full_name":"yerfor/Real3DPortrait","owner":"yerfor","description":"Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code","archived":false,"fork":false,"pushed_at":"2024-10-18T09:11:18.000Z","size":21749,"stargazers_count":1019,"open_issues_count":35,"forks_count":120,"subscribers_count":26,"default_branch":"main","last_synced_at":"2025-04-06T15:04:07.768Z","etag":null,"topics":["nerf","one-shot","talking-face-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yerfor.png","metadata":{"files":{"readme":"README-zh.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-02T07:08:40.000Z","updated_at":"2025-04-01T01:57:46.000Z","dependencies_parsed_at":"2024-11-06T22:34:33.135Z","dependency_job_id":null,"html_url":"https://github.com/yerfor/Real3DPortrait","commit_stats":{"total_commits":32,"total_committers":10,"mean_commits":3.2,"dds":0.5,"last_synced_commit":"a9d70c7ee81bc9349f0f5e558f406f75194bd776"},"previous_names":["yerfor/real3dportrait"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yerfor%2FReal3DPortrait","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yerfor%2FReal3DPortrait/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yerfor%2FReal3DPortrait/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yerfor%2FReal3DPortrait/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yerfor","download_url":"https://codeload.github.com/yerfor/Real3DPortrait/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248409841,"owners_count":21098771,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nerf","one-shot","talking-face-generation"],"created_at":"2024-08-01T18:01:55.937Z","updated_at":"2025-04-11T13:33:06.833Z","avatar_url":"https://github.com/yerfor.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis | ICLR 2024 Spotlight\n[![arXiv](https://img.shields.io/badge/arXiv-Paper-%3CCOLOR%3E.svg)](https://arxiv.org/abs/2401.08503)| [![GitHub Stars](https://img.shields.io/github/stars/yerfor/Real3DPortrait\n)](https://github.com/yerfor/Real3DPortrait) | [English Readme](./README.md)\n\nThis repository is the official PyTorch implementation of Real3D-Portrait, for one-shot, high video-reality talking portrait synthesis. Visit our [project page](https://real3dportrait.github.io/) to watch the demo videos, and read our [paper](https://arxiv.org/pdf/2401.08503.pdf) for the technical details.\n\n\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003cimg src=\"assets/real3dportrait.png\" width=\"100%\"/\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\n## You May Also Be Interested In\n- We have released GeneFace++ ([https://github.com/yerfor/GeneFacePlusPlus](https://github.com/yerfor/GeneFacePlusPlus)), a talking face synthesis system focused on improving results for a single specific speaker. It achieves accurate lip synchronization, high video quality, and high system efficiency.\n\n\n# 
Quick Start!\n## Environment Setup\nFollow the [installation guide](docs/prepare_env/install_guide-zh.md) to set up the Conda environment `real3dportrait`.\n## Download Pre-trained and Third-Party Models\n### 3DMM BFM Model\nDownload the 3DMM BFM model from [Google Drive](https://drive.google.com/drive/folders/1o4t5YIw7w4cMUN4bgU9nPf6IyWVG1bEk?usp=sharing) or [BaiduYun Disk](https://pan.baidu.com/s/1aqv1z_qZ23Vp2VP4uxxblQ?pwd=m9q5 ) (extraction code: m9q5).\n\n\nAfter downloading, put all the files into `deep_3drecon/BFM`. The file structure should look like this:\n```\ndeep_3drecon/BFM/\n├── 01_MorphableModel.mat\n├── BFM_exp_idx.mat\n├── BFM_front_idx.mat\n├── BFM_model_front.mat\n├── Exp_Pca.bin\n├── facemodel_info.mat\n├── index_mp468_from_mesh35709.npy\n├── mediapipe_in_bfm53201.npy\n└── std_exp.txt\n```\n\n### Pre-trained Models\nDownload the pre-trained Real3D-Portrait checkpoints from [Google Drive](https://drive.google.com/drive/folders/1MAveJf7RvJ-Opg1f5qhLdoRoC_Gc6nD9?usp=sharing) or [BaiduYun Disk](https://pan.baidu.com/s/1Mjmbn0UtA1Zm9owZ7zWNgQ?pwd=6x4f ) (extraction code: 6x4f).\n  \nAfter downloading, put all the files into `checkpoints` and extract them. The file structure should look like this:\n```\ncheckpoints/\n├── 240210_real3dportrait_orig\n│   ├── audio2secc_vae\n│   │   ├── config.yaml\n│   │   └── model_ckpt_steps_400000.ckpt\n│   └── secc2plane_torso_orig\n│       ├── config.yaml\n│       └── model_ckpt_steps_100000.ckpt\n└── pretrained_ckpts\n    └── mit_b0.pth\n```\n\n## Inference\nWe currently provide three ways to run inference: **command line (CLI)**, **Gradio WebUI**, and **Google Colab**. Both audio-driven and video-driven modes are supported:\n\n- Audio-driven: provide at least a `source image` and a `driving audio`\n- Video-driven: provide at least a `source image` and a `driving expression video`\n\n### Gradio WebUI\nLaunch the Gradio WebUI, upload your inputs as prompted, and click the `Generate` button to run inference:\n```bash\npython inference/app_real3dportrait.py\n```\n\n### Google Colab\nRun all the cells in this [Colab](https://colab.research.google.com/github/yerfor/Real3DPortrait/blob/main/inference/real3dportrait_demo.ipynb).\n\n### Command-Line Inference\nFirst, switch to the project root directory and activate the Conda environment:\n```bash\ncd \u003cReal3DPortraitRoot\u003e\nconda activate real3dportrait\nexport PYTHONPATH=./\n```\nFor audio-driven inference, provide at least the source image and the driving audio:\n```bash\npython inference/real3d_infer.py \\\n--src_img \u003cPATH_TO_SOURCE_IMAGE\u003e \\\n--drv_aud \u003cPATH_TO_AUDIO\u003e 
\\\n--drv_pose \u003cPATH_TO_POSE_VIDEO, OPTIONAL\u003e \\\n--bg_img \u003cPATH_TO_BACKGROUND_IMAGE, OPTIONAL\u003e \\\n--out_name \u003cPATH_TO_OUTPUT_VIDEO, OPTIONAL\u003e\n```\nFor video-driven inference, provide at least the source image and the driving expression video (passed as the `--drv_aud` argument):\n```bash\npython inference/real3d_infer.py \\\n--src_img \u003cPATH_TO_SOURCE_IMAGE\u003e \\\n--drv_aud \u003cPATH_TO_EXP_VIDEO\u003e \\\n--drv_pose \u003cPATH_TO_POSE_VIDEO, OPTIONAL\u003e \\\n--bg_img \u003cPATH_TO_BACKGROUND_IMAGE, OPTIONAL\u003e \\\n--out_name \u003cPATH_TO_OUTPUT_VIDEO, OPTIONAL\u003e\n```\nNotes on some optional parameters:\n- `--drv_pose`: provides the head pose motion when specified; if omitted, a static pose is used\n- `--bg_img`: provides the background when specified; if omitted, the background extracted from the source image is used\n- `--mouth_amp`: mouth opening amplitude; larger values open the mouth wider\n- `--map_to_init_pose`: when `True`, the first frame's pose is mapped to the source pose, and the same transformation is applied to all subsequent frames\n- `--temperature`: sampling temperature of audio2motion; higher values give more diverse but less accurate results\n- `--out_name`: if omitted, results are saved in `infer_out/tmp/`\n- `--out_mode`: `final` outputs only the talking portrait video; `concat_debug` also outputs some visualized intermediate results\n\nExample command:\n```bash\npython inference/real3d_infer.py \\\n--src_img data/raw/examples/Macron.png \\\n--drv_aud data/raw/examples/Obama_5s.wav \\\n--drv_pose data/raw/examples/May_5s.mp4 \\\n--bg_img data/raw/examples/bg.png \\\n--out_name output.mp4 \\\n--out_mode concat_debug\n```\n\n## ToDo\n- [x] **Release Pre-trained weights of Real3D-Portrait.**\n- [x] **Release Inference Code of Real3D-Portrait.**\n- [x] **Release Gradio Demo of Real3D-Portrait.**\n- [x] **Release Google Colab of Real3D-Portrait.**\n- [ ] **Release Training Code of Real3D-Portrait.**\n\n# Citation\nIf this repository helps you, please consider citing our work:\n```\n@article{ye2024real3d,\n  title={Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis},\n  author={Ye, Zhenhui and Zhong, Tianyun and Ren, Yi and Yang, Jiaqi and Li, Weichuang and Huang, Jiawei and Jiang, Ziyue and He, Jinzheng and Huang, Rongjie and Liu, Jinglin and others},\n  journal={arXiv preprint arXiv:2401.08503},\n  year={2024}\n}\n@article{ye2023geneface++,\n  title={GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking 
Face Generation},\n  author={Ye, Zhenhui and He, Jinzheng and Jiang, Ziyue and Huang, Rongjie and Huang, Jiawei and Liu, Jinglin and Ren, Yi and Yin, Xiang and Ma, Zejun and Zhao, Zhou},\n  journal={arXiv preprint arXiv:2305.00787},\n  year={2023}\n}\n@article{ye2023geneface,\n  title={GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis},\n  author={Ye, Zhenhui and Jiang, Ziyue and Ren, Yi and Liu, Jinglin and He, Jinzheng and Zhao, Zhou},\n  journal={arXiv preprint arXiv:2301.13430},\n  year={2023}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyerfor%2FReal3DPortrait","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyerfor%2FReal3DPortrait","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyerfor%2FReal3DPortrait/lists"}