{"id":13584180,"url":"https://github.com/yule-li/Human-Video-Generation","last_synced_at":"2025-04-07T01:31:33.427Z","repository":{"id":43029487,"uuid":"195186209","full_name":"yule-li/Human-Video-Generation","owner":"yule-li","description":"Human Video Generation Paper List","archived":false,"fork":false,"pushed_at":"2024-03-02T02:45:32.000Z","size":2859,"stargazers_count":458,"open_issues_count":1,"forks_count":52,"subscribers_count":38,"default_branch":"master","last_synced_at":"2024-11-06T01:39:27.287Z","etag":null,"topics":["3d","gan","generation","human","video"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yule-li.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-04T06:57:26.000Z","updated_at":"2024-11-04T04:29:14.000Z","dependencies_parsed_at":"2024-02-18T03:21:10.694Z","dependency_job_id":"bea3389c-649e-4842-b06a-8b098c115398","html_url":"https://github.com/yule-li/Human-Video-Generation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yule-li%2FHuman-Video-Generation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yule-li%2FHuman-Video-Generation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yule-li%2FHuman-Video-Generation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yule-li%2FHuman-Video-Generation/manifests","owner_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/owners/yule-li","download_url":"https://codeload.github.com/yule-li/Human-Video-Generation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247577913,"owners_count":20961193,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d","gan","generation","human","video"],"created_at":"2024-08-01T15:04:04.096Z","updated_at":"2025-04-07T01:31:28.413Z","avatar_url":"https://github.com/yule-li.png","language":null,"funding_links":[],"categories":["Others"],"sub_categories":[],"readme":"# Human Video Generation \n## Paper List\n### 2018\n- **Face2Face**: \"Real-time Face Capture and Reenactment of RGB Videos\" \"CVPR\" (2016) [[paper](https://web.stanford.edu/~zollhoef/papers/CVPR2016_Face2Face/paper.pdf)][[project](https://web.stanford.edu/~zollhoef/papers/CVPR2016_Face2Face/page.html)]\n- **PSGAN**: \"Pose Guided Human Video Generation\" \"ECCV\" (2018) [[paper](http://openaccess.thecvf.com/content_ECCV_2018/papers/Ceyuan_Yang_Pose_Guided_Human_ECCV_2018_paper.pdf)]\n- **DVP**: \"Deep Video Portraits\" \"Siggraph\"(2018) [[paper](https://web.stanford.edu/~zollhoef/papers/SG2018_DeepVideo/paper.pdf)][[project](https://web.stanford.edu/~zollhoef/papers/SG2018_DeepVideo/page.html)]\n- **Recycle-GAN**: \"Recycle-GAN: Unsupervised Video Retargeting\" \"ECCV\"(2018) [[paper](https://www.cs.cmu.edu/~aayushb/Recycle-GAN/recycle_gan.pdf)][[project](https://www.cs.cmu.edu/~aayushb/Recycle-GAN/)][[code](https://github.com/aayushbansal/Recycle-GAN)]\n- **X2Face**: \"X2Face: A network for controlling face 
generation by using images, audio, and pose codes\" \"ECCV\"(2018) [[paper](http://www.robots.ox.ac.uk/~vgg/publications/2018/Wiles18/wiles18.pdf)][[project](http://www.robots.ox.ac.uk/~vgg/research/unsup_learn_watch_faces/x2face.html)][[code](https://github.com/oawiles/X2Face)]\n- **EBDN**: \"Everybody Dance Now\" \"arXiv\"(2018) [[paper](https://arxiv.org/pdf/1808.07371.pdf)][[project](https://carolineec.github.io/everybody_dance_now/)]\n- **Vid2Vid**: \"Video-to-Video Synthesis\" \"NIPS\"(2018) [[paper](https://tcwang0509.github.io/vid2vid/paper_vid2vid.pdf)][[project](https://tcwang0509.github.io/vid2vid/)][[code](https://github.com/NVIDIA/vid2vid)]\n### 2019\n- **NAR**: \"Neural Animation and Reenactment of Human Actor Videos\" \"Siggraph\"(2019) [[paper](https://arxiv.org/abs/1809.03658)][[project](http://gvv.mpi-inf.mpg.de/projects/wxu/HumanReenactment/)]\n- **TETH**: \"Text-based Editing of Talking-head Video\" \"Siggraph\"(2019) [[paper](https://www.ohadf.com/projects/text-based-editing/data/text-based-editing.pdf)][[project](https://www.ohadf.com/projects/text-based-editing/)]\n- **VPC**: \"Deep Video-Based Performance Cloning\" \"Eurographics\"(2019) [[paper](https://arxiv.org/abs/1808.06847)]\n- **FSTH**: \"Few-Shot Adversarial Learning of Realistic Neural Talking Head Models\" \"CVPR\"(2019) [[paper](https://arxiv.org/pdf/1905.08233.pdf)][[code unofficial](https://github.com/grey-eye/talking-heads)]\n- **TNA**: \"Textured Neural Avatars\" \"CVPR\"(2019) [[paper](https://arxiv.org/abs/1905.08776)][[project](https://saic-violet.github.io/texturedavatar/)]\n- **VOCA**: \"Voice Operated Character Animation\" \"CVPR\"(2019) [[paper](https://ps.is.tuebingen.mpg.de/uploads_file/attachment/attachment/510/paper_final.pdf)][[project](https://voca.is.tue.mpg.de/)][[code](https://github.com/TimoBolkart/voca)]\n- **Audio2Face**: \"Audio2Face: Generating Speech/Face Animation from Single Audio with Attention-Based Bidirectional LSTM Networks\" \"arXiv\"(2019) 
[[paper](https://arxiv.org/abs/1905.11142)]\n- **RSDA**: \"Realistic Speech-Driven Animation with GANs\" \"arXiv\"(2019) [[paper](https://arxiv.org/abs/1906.06337)][[project](https://sites.google.com/view/facial-animation)][[code](https://github.com/DinoMan/speech-driven-animation)]\n- **LISCG**: \"Learning Individual Styles of Conversational Gesture\" \"arXiv\"(2019) [[paper](https://arxiv.org/abs/1906.04160)] [[project](http://people.eecs.berkeley.edu/~shiry/projects/speech2gesture/)][[code](https://github.com/amirbar/speech2gesture)]\n- **AUDIO2FACE**: \"AUDIO2FACE: GENERATING SPEECH/FACE ANIMATION FROM SINGLE AUDIO WITH ATTENTION-BASED BIDIRECTIONAL LSTM NETWORKS\" \"ICMI\"(2019)\n- **AvatarSim**: \"A High-Fidelity Open Embodied Avatar with Lip Syncing and Expression Capabilities\" \"ICMI\"(2019) [[code](https://github.com/danmcduff/AvatarSim)]\n- **NVP**: \"Neural Voice Puppetry: Audio-driven Facial Reenactment\" \"arXiv\"(2019) [[paper](https://arxiv.org/pdf/1912.05566.pdf)]\n- **CSGN**: \"Convolutional Sequence Generation for Skeleton-Based Action Synthesis\" \"ICCV\"(2019) [[paper](http://yjxiong.me/papers/iccv19csgn.pdf)]\n- **Few shot VID2VID**: \"Few-shot Video-to-Video Synthesis\" [[paper](https://nvlabs.github.io/few-shot-vid2vid/main.pdf)] [[project](https://nvlabs.github.io/few-shot-vid2vid/)] [[code](https://github.com/NVlabs/few-shot-vid2vid)]\n- **FOM**: \"First Order Motion Model for Image Animation\" \"NIPS\"(2019) [[paper](http://papers.nips.cc/paper/8935-first-order-motion-model-for-image-animation.pdf)] [[project](https://aliaksandrsiarohin.github.io/first-order-model-website/)] [[code](https://github.com/AliaksandrSiarohin/first-order-model)]\n### 2020\n- **TransMoMo**: \"TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting\" \"CVPR\"(2020) [[paper](https://arxiv.org/pdf/2003.14401.pdf)] [[project](https://yzhq97.github.io/transmomo/)] [[code](https://github.com/yzhq97/transmomo.pytorch)]\n- **poseflow**: \"Deep Image Spatial 
Transformation for Person Image Generation\" \"CVPR\"(2020) [[paper](https://arxiv.org/abs/2003.00696)] [[project](https://renyurui.github.io/GFLA-web/)] [[code](https://github.com/RenYurui/Global-Flow-Local-Attention)]\n- **PIFuHD**: \"PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization\" \"CVPR(Oral)\"(2020) [[paper](https://arxiv.org/pdf/2004.00452.pdf)] [[project](https://shunsukesaito.github.io/PIFuHD/)] [[code](https://github.com/facebookresearch/pifuhd)]\n- **Hifi3dface**: \"High-Fidelity 3D Digital Human Creation from RGB-D Selfies\" \"arXiv\"(2020.10) [[paper](https://arxiv.org/pdf/2010.05562.pdf)][[project](https://tencent-ailab.github.io/hifi3dface_projpage/)] [[code](https://github.com/tencent-ailab/hifi3dface)]\n- **face-vid2vid**: \"One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing\" \"arXiv\"(2020.11) [[paper](https://arxiv.org/abs/2011.15126)] [[project](https://nvlabs.github.io/face-vid2vid/)] [[code](https://github.com/NVlabs/face-vid2vid)]\n- **HeadGan**: \"HeadGAN: Video-and-Audio-Driven Talking Head Synthesis\" \"arXiv\"(2020.12) [[paper](https://arxiv.org/pdf/2012.08261.pdf)]\n- \"Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose\" \"arXiv\"(2020) [[paper](http://arxiv.org/abs/2002.10137)][[code](https://github.com/yiranran/Audio-driven-TalkingFace-HeadPose)]\n\n### 2021\n- **Talking-Face_PC-AVS**: \"Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation\" \"CVPR\"(2021) [[code](https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS)][[project](https://hangz-nju-cuhk.github.io/projects/PC-AVS)][[demo](https://www.youtube.com/watch?v=lNQQHIggnUg)]\n- **Pixel Codec Avatar** \"Pixel Codec Avatars\" \"arXiv\"(2021.04) [[paper](https://arxiv.org/pdf/2104.04638.pdf)]\n- **MRAA** \"Motion Representations for Articulated Animation\"  \"CVPR\"(2021) 
[[project](https://aliaksandrsiarohin.github.io/motion-representation-website/)]\n- **NWT** \"Towards natural audio-to-video generation with representation learning\" \"arXiv\"(2021)[[paper](https://arxiv.org/pdf/2106.04283.pdf)][[project](https://next-week-tonight.github.io/NWT/)]\n- **LipSync3D** Data-Efficient Learning of Personalized 3D Talking Faces from Video using Pose and Lighting Normalization \"arXiv\"(2021) [[paper](https://arxiv.org/pdf/2106.04185.pdf)][[demo](https://www.youtube.com/watch?v=L1StbX9OznY)]\n- **AD-NeRF** Audio Driven Neural Radiance Fields for Talking Head Synthesis \"ICCV\"(2021) [[paper](https://arxiv.org/abs/2103.11078)][[code](https://github.com/YudongGuo/AD-NeRF)][[demo](https://www.youtube.com/watch?v=TQO2EBYXLyU)][[project](https://yudongguo.github.io/ADNeRF/)]\n- **LSP** Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation [[paper](https://yuanxunlu.github.io/projects/LiveSpeechPortraits/resources/SIGGRAPH_Asia_2021__Live_Speech_Portraits__Real_Time_Photorealistic_Talking_Head_Animation.pdf)][[code](https://github.com/YuanxunLu/LiveSpeechPortraits)][[project](https://yuanxunlu.github.io/projects/LiveSpeechPortraits/)][[demo](https://yuanxunlu.github.io/projects/LiveSpeechPortraits/resources/[Compressed]SIGGRAPHAsia21_LiveSpeechPortraits.mp4)]\n- **FaceFormer** FaceFormer: Speech-Driven 3D Facial Animation with Transformers \"arXiv\"(2021.12) [[paper](https://arxiv.org/pdf/2112.05329.pdf)]\n- **HeadNeRF** HeadNeRF: A Real-time NeRF-based Parametric Head Model \"arXiv\"(2021.12) [[paper](https://arxiv.org/pdf/2112.05637.pdf)][[project](https://hy1995.top/HeadNeRF-Project/)]\n- **FACIAL** FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning \"ICCV\"(2021) [[paper](https://arxiv.org/abs/2108.07938)][[code](https://github.com/zhangchenxu528/FACIAL)]\n\n### 2022\n- **NPFAP** Video-driven Neural Physically-based Facial Asset for Production 
\"arXiv\"(2022.02)[[paper](https://arxiv.org/pdf/2202.05592.pdf)]\n- **PGMPI** Real-Time Neural Character Rendering with Pose-Guided Multiplane Images \"ECCV\"(2022) [[paper](https://arxiv.org/pdf/2204.11820.pdf)][[code](https://github.com/ken-ouyang/PGMPI)][[project](https://ken-ouyang.github.io/cmpi/index.html)]\n- **VideoReTalking** Audio-based Lip Synchronization for Talking Head Video Editing In the Wild \"arXiv\"(2022.11) [[paper](https://arxiv.org/abs/2211.14758)][[code](https://github.com/vinthony/video-retalking)][[project](https://vinthony.github.io/video-retalking/)]\n- **One-Shot-Talking-Face** One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning \"AAAI\"(2022) [[paper](https://arxiv.org/pdf/2112.02749.pdf)][[code](https://github.com/FuxiVirtualHuman/AAAI22-one-shot-talking-face)][[demo](https://www.youtube.com/watch?v=HHj-XCXXePY)]\n- **RAD-NeRF**: Real-time Neural Talking Portrait Synthesis \"arXiv\"(2022.12) [[paper](https://arxiv.org/pdf/2211.12368.pdf)][[code](https://github.com/ashawkey/RAD-NeRF)]\n\n### 2023 \n- **SadTalker** Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation \"CVPR\"(2023) [[paper](https://arxiv.org/abs/2211.12194)][[code](https://github.com/Winfredy/SadTalker)][[project](https://sadtalker.github.io/)]\n- **GeneFace**: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis \"ICLR\"(2023) [[project](https://genefaceplusplus.github.io/)][[code](https://github.com/yerfor/GeneFace)][[dockerfile](https://github.com/xk-huang/GeneFace/tree/main/docker)]\n- Towards Realistic Generative 3D Face Models \"arXiv\"(2023.04) [[paper](https://arxiv.org/pdf/2304.12483.pdf)][[project](https://aashishrai3799.github.io/Towards-Realistic-Generative-3D-Face-Models/)][[code](https://github.com/aashishrai3799/Towards-Realistic-Generative-3D-Face-Models/)]\n- **Live 3D Portrait**: Real-Time Radiance Fields for Single-Image Portrait View Synthesis 
\"SIGGRAPH\" (2023) [[project](https://research.nvidia.com/labs/nxp/lp3d/)][[paper](https://research.nvidia.com/labs/nxp/lp3d/media/paper.pdf)]\n- **StyleAvatar**: Real-time Photo-realistic Portrait Avatar from a Single Video  \"SIGGRAPH\" (2023) [[code](https://github.com/LizhenWangT/StyleAvatar)][[project](https://www.liuyebin.com/styleavatar/styleavatar.html)][[paper](https://www.liuyebin.com/styleavatar/assets/StyleAvatar.pdf)]\n- **OTAvatar**: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [[code](https://github.com/theEricMa/OTAvatar)] \"arXiv\"(2023) [[paper](https://arxiv.org/pdf/2303.14662.pdf)]\n- **DisCoHead**: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions \"arXiv\"(2023) [[project](https://deepbrainai-research.github.io/discohead/)]\n- **GeneFace++**: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation [[project](https://genefaceplusplus.github.io/)]\n- **HumanRF**: High-Fidelity Neural Radiance Fields for Humans in Motion \"SIGGRAPH\" (2023) [[project](https://synthesiaresearch.github.io/humanrf/)][[code](https://github.com/synthesiaresearch/humanrf)]\n- **PointAvatar**: Deformable Point-based Head Avatars from Videos \"CVPR\"(2023) [[project](https://zhengyuf.github.io/PointAvatar/)][[code](https://github.com/zhengyuf/pointavatar)][[paper](https://arxiv.org/abs/2212.08377)]\n- **SyncTalk**: SyncTalk: The Devil😈 is in the Synchronization for Talking Head Synthesis \"arXiv\"(2023.11) [[project](https://ziqiaopeng.github.io/synctalk/)][[code](https://github.com/ziqiaopeng/SyncTalk)]\n\n### 2024\n- **Real3D-Portrait**: Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis \"ICLR\"(2024) [[project](https://real3dportrait.github.io/)][[code](https://github.com/yerfor/Real3DPortrait)][[paper](https://arxiv.org/abs/2401.08503)]\n- **EMO**: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak 
Conditions \"arXiv\"(2024.02) [[project](https://humanaigc.github.io/emote-portrait-alive/)][[paper](https://arxiv.org/abs/2402.17485)][[code](https://github.com/HumanAIGC/EMO)]\n\n\n\n\n## Applications\n### Face Swap\n- ZAO: a hot app.\n\n[![Video generated based on ZAO](https://img.youtube.com/vi/m0u68w2H7_Y/0.jpg)](https://www.youtube.com/watch?v=m0u68w2H7_Y)\n### AI Host: \n\n[![Video generated by SouGou](./images/AI-host.png)](https://m.weibo.cn/status/4403475372638235?wm=3333_2001\u0026from=1097193010\u0026sourcetype=dingding)\n## Dataset\n\n## Researchers \u0026 Teams\n\n1. [Graphics, Vision \u0026 Video at MPII](http://gvv.mpi-inf.mpg.de/)\n2. [REAL VIRTUAL HUMANS at MPII](https://virtualhumans.mpi-inf.mpg.de/)\n3. [Visual Computing Group at TUM](http://www.niessnerlab.org/index.html)\n4. [Perceiving Systems Department at MPII](https://ps.is.tuebingen.mpg.de/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyule-li%2FHuman-Video-Generation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyule-li%2FHuman-Video-Generation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyule-li%2FHuman-Video-Generation/lists"}