# awesome-avatar
📖 A curated list of resources dedicated to avatars.
https://github.com/Jason-cs18/awesome-avatar
## Papers

### 3D Avatar (face+body)
- A Survey on 3D Human Avatar Modeling - From Reconstruction to Generation
- Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling - [Code](https://github.com/lizhe00/AnimatableGaussians) ![Github stars](https://img.shields.io/github/stars/lizhe00/AnimatableGaussians.svg) ![Github forks](https://img.shields.io/github/forks/lizhe00/AnimatableGaussians.svg)
- Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
- HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling
- AvatarReX: Real-time Expressive Full-body Avatars
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
- 4K4D: Real-Time 4D View Synthesis at 4K Resolution - real-time synthesis with 3DGS

### 2D talking-face synthesis
- arXiv 2024 - trained on `160 hours` of cleaned video data (collected from the internet)
- SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation - [notes](https://github.com/Jason-cs18/awesome-avatar/blob/main/notes/sadtalker.md)
- Wav2Lip: Accurately Lip-sync Videos to Any Speech - lip-sync model with low video quality (`96*96`), pre-trained on ~`180` hours of video data from [LRS2](https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html)
- Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
- Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation - [Code](https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS) ![Github stars](https://img.shields.io/github/stars/Hangz-nju-cuhk/Talking-Face_PC-AVS.svg) ![Github forks](https://img.shields.io/github/forks/Hangz-nju-cuhk/Talking-Face_PC-AVS.svg) - contrastive learning on audio-lip pairs (see the sync-loss sketch after this list)
- PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering
- StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN - high-fidelity synthesis via StyleGAN
- VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild - [Code](https://github.com/OpenTalker/video-retalking) ![Github stars](https://img.shields.io/github/stars/OpenTalker/video-retalking.svg) ![Github forks](https://img.shields.io/github/forks/OpenTalker/video-retalking.svg)
- DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video - lip-sync and high-quality synthesis (`256*256`)
- DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models - [Code](https://github.com/ali-vilab/dreamtalk) ![Github stars](https://img.shields.io/github/stars/ali-vilab/dreamtalk.svg) ![Github forks](https://img.shields.io/github/forks/ali-vilab/dreamtalk.svg) - diffusion-based
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
- LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control - expression retargeting
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation - [Code](https://github.com/fudan-generative-vision/hallo) ![Github stars](https://img.shields.io/github/stars/fudan-generative-vision/hallo.svg) ![Github forks](https://img.shields.io/github/forks/fudan-generative-vision/hallo.svg) - accurate lip-sync, diffusion, pre-trained on `164 hours` of cleaned video data (155 hours from the internet and 9 hours from HDTF)
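
For readers new to the lip-sync objectives referenced above (Wav2Lip's expert sync discriminator, PC-AVS's contrastive audio-lip learning), here is a minimal PyTorch sketch of the idea. The encoder architectures, feature shapes, and temperature are illustrative assumptions, not the papers' exact settings.

```python
# A minimal sketch of a SyncNet-style audio-lip objective, written in the
# contrastive (InfoNCE) form used by PC-AVS-like models. Shapes and module
# sizes below are assumptions for illustration, not the papers' networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # Encodes a short mel-spectrogram window (B, 1, 80, 16) to an embedding.
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(80 * 16, dim))

    def forward(self, mel):
        return F.normalize(self.net(mel), dim=-1)

class LipEncoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # Encodes a stack of 5 mouth-region RGB frames (B, 15, 48, 96).
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(15 * 48 * 96, dim))

    def forward(self, frames):
        return F.normalize(self.net(frames), dim=-1)

def sync_loss(audio_emb, lip_emb):
    # Matched audio/lip pairs sit on the diagonal of the similarity matrix;
    # cross-entropy pulls them together and pushes mismatched pairs apart.
    logits = audio_emb @ lip_emb.t() / 0.07  # temperature 0.07 (assumed)
    targets = torch.arange(len(logits), device=logits.device)
    return F.cross_entropy(logits, targets)

audio = AudioEncoder()(torch.randn(8, 1, 80, 16))
lips = LipEncoder()(torch.randn(8, 15, 48, 96))
print(sync_loss(audio, lips))
```

Wav2Lip instead trains this sync expert once and freezes it as a discriminator that scores generated mouths; the contrastive form above folds the same signal into a single loss.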

### Talking-body synthesis
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation - pose-guided character animation (see the pose-guider sketch after this list)
- arXiv 2024 - trained on `200 hours` of video data and more than `10k` unique identities
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance - [Code](https://github.com/fudan-generative-vision/champ) ![Github stars](https://img.shields.io/github/stars/fudan-generative-vision/champ.svg) ![Github forks](https://img.shields.io/github/forks/fudan-generative-vision/champ.svg)
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
- MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
- ControlNeXt: Powerful and Efficient Control for Image and Video Generation - [Code](https://github.com/dvlab-research/ControlNeXt) ![Github stars](https://img.shields.io/github/stars/dvlab-research/ControlNeXt.svg) ![Github forks](https://img.shields.io/github/forks/dvlab-research/ControlNeXt.svg) - based on Stable Video Diffusion
- Video-to-Video Synthesis
- Everybody Dance Now
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model - [Code](https://github.com/magic-research/magic-animate) ![Github stars](https://img.shields.io/github/stars/magic-research/magic-animate.svg) ![Github forks](https://img.shields.io/github/forks/magic-research/magic-animate.svg)
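
Several of the entries above (Animate Anyone, MusePose, MagicAnimate) condition a diffusion denoiser on a rendered pose sequence. Here is a minimal sketch of that pose-guider idea; the shapes and module sizes are assumptions for illustration, not any paper's exact architecture.

```python
# A lightweight convolutional "pose guider": it encodes a rendered skeleton
# map down to latent resolution and adds it to the noisy latent before the
# denoising UNet (omitted here), so pose controls structure while the
# reference image controls appearance via attention.
import torch
import torch.nn as nn

class PoseGuider(nn.Module):
    def __init__(self, latent_channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.SiLU(),    # 512 -> 256
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.SiLU(),   # 256 -> 128
            nn.Conv2d(32, latent_channels, 3, stride=2, padding=1)  # 128 -> 64
        )

    def forward(self, pose_map):
        return self.net(pose_map)

guider = PoseGuider()
pose_map = torch.randn(1, 3, 512, 512)   # rendered OpenPose/DWPose skeleton frame
noisy_latent = torch.randn(1, 4, 64, 64)
conditioned = noisy_latent + guider(pose_map)
print(conditioned.shape)  # torch.Size([1, 4, 64, 64])
```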

### Image and video generation
- Alias-Free Generative Adversarial Networks (StyleGAN3) - [Code](https://github.com/NVlabs/stylegan3) - high-fidelity face generation
- High-Resolution Image Synthesis with Latent Diffusion Models - [Code](https://github.com/CompVis/latent-diffusion) - diverse and high-quality images
- Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets - [Code](https://github.com/Stability-AI/generative-models)
- Scalable Diffusion Models with Transformers - [Code](https://github.com/facebookresearch/DiT) - the magic behind OpenAI Sora
- Neural Discrete Representation Learning - [DALL-E 2 and DALL-E 1 Explained](https://vaclavkosar.com/ml/openai-dall-e-2-and-dall-e-1) - the magic behind OpenAI DALL-E
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis - 3D synthesis via volume rendering (see the rendering sketch after this list)
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering - [Code](https://github.com/graphdeco-inria/gaussian-splatting) - real-time 3D rendering
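
Since NeRF's volume rendering underpins most of the 3D methods in this list, here is a minimal sketch of its rendering quadrature (Eq. 3 of the NeRF paper). Ray sampling and the radiance MLP are omitted; only the compositing step is shown.

```python
# Given per-sample densities and colors along one ray, accumulate
# transmittance-weighted color into a single pixel value.
import torch

def render_ray(sigma, rgb, deltas):
    """sigma: (N,) densities, rgb: (N, 3) colors, deltas: (N,) segment lengths."""
    alpha = 1.0 - torch.exp(-sigma * deltas)  # opacity of each ray segment
    # Transmittance T_i = prod_{j<i} (1 - alpha_j): an exclusive cumulative product.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha                   # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)  # composited RGB

n = 64
color = render_ray(torch.rand(n), torch.rand(n, 3), torch.full((n,), 0.03))
print(color)  # one rendered pixel color
```

3D Gaussian Splatting replaces the per-ray MLP queries with rasterized, depth-sorted Gaussians, but the same front-to-back alpha compositing is what makes it real-time.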

### 3D talking-face synthesis
- AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis - [Code](https://github.com/YudongGuo/AD-NeRF) ![Github stars](https://img.shields.io/github/stars/YudongGuo/AD-NeRF.svg) ![Github forks](https://img.shields.io/github/forks/YudongGuo/AD-NeRF.svg) - audio-conditioned NeRF (see the sketch after this list)
- Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
- GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
- Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis - [Code](https://github.com/Fictionarry/ER-NeRF) ![Github stars](https://img.shields.io/github/stars/Fictionarry/ER-NeRF.svg) ![Github forks](https://img.shields.io/github/forks/Fictionarry/ER-NeRF.svg)
- TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
- GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
- SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
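
AD-NeRF and its successors above condition a radiance field on per-frame audio features. A minimal sketch of that conditioning follows; the feature sizes and the toy MLP are assumptions standing in for the papers' networks.

```python
# Condition the radiance field on audio by concatenating a per-frame audio
# feature (e.g. from a DeepSpeech/wav2vec window) to the positional encoding
# of every 3D sample, so the mouth region varies with speech.
import torch
import torch.nn as nn

class AudioConditionedField(nn.Module):
    def __init__(self, pos_dim=63, audio_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + audio_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (sigma, r, g, b) per sample
        )

    def forward(self, pos_enc, audio_feat):
        # Broadcast one audio feature across all N samples of the frame's rays.
        audio = audio_feat.expand(pos_enc.shape[0], -1)
        out = self.mlp(torch.cat([pos_enc, audio], dim=-1))
        sigma, rgb = out[:, :1], torch.sigmoid(out[:, 1:])
        return sigma, rgb

field = AudioConditionedField()
sigma, rgb = field(torch.randn(1024, 63), torch.randn(1, 64))
print(sigma.shape, rgb.shape)  # torch.Size([1024, 1]) torch.Size([1024, 3])
```

The rendered densities and colors then go through the same volume-rendering quadrature sketched in the previous section.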

## Researchers and labs
- **NVIDIA Research**
  - Lumos SIGGRAPH Asia 2022
  - vid2vid NeurIPS'18, [few-shot vid2vid NeurIPS'19](https://nvlabs.github.io/few-shot-vid2vid/), [EG3D CVPR'22](https://github.com/NVlabs/eg3d)
  - face-vid2vid CVPR'21 (one-shot talking-head synthesis)
  - DreamPose ICCV'23
  - Avatar Fingerprinting arXiv'23
- **Aliaksandr Siarohin @ Snap Research**
  - Unsupervised-Volumetric-Animation CVPR'23, [3D-SGAN ECCV'22](https://arxiv.org/abs/2112.01422), [Articulated-Animation CVPR'21](https://arxiv.org/abs/2104.11280), [Monkey-Net CVPR'19](https://arxiv.org/abs/1812.08861), [FOMM NeurIPS'19](http://papers.nips.cc/paper/8935-first-order-motion-model-for-image-animation)
- **Ziwei Liu @ Nanyang Technological University**
  - StyleSync CVPR'23, [AV-CAT SIGGRAPH Asia 2022](https://hangz-nju-cuhk.github.io/projects/AV-CAT), [StyleGANEX ICCV'23](https://www.mmlab-ntu.com/project/styleganex/), [StyleSwap ECCV'22](https://hangz-nju-cuhk.github.io/projects/StyleSwap), [PC-AVS CVPR'21](https://hangz-nju-cuhk.github.io/projects/PC-AVS), [Speech2Talking-Face IJCAI'21](https://www.ijcai.org/proceedings/2021/0141.pdf), [VToonify SIGGRAPH Asia 2022](https://www.youtube.com/watch?v=0_OmVhDgYuY)
  - MotionDiffuse arXiv'22
  - Relighting4D ECCV'22
- **Xiaodong Cun @ Tencent AI Lab**
  - StyleHEAT ECCV'22
  - LivelySpeaker ICCV'23
- **Gordon Wetzstein @ Stanford University**
  - SSIF SIGGRAPH'23
  - FLAME SIGGRAPH Asia 2017