# awesome-avatar
📖 A curated list of resources dedicated to avatars.
https://github.com/Jason-cs18/awesome-avatar
## Papers

### 3D Avatar (face+body)
- A Survey on 3D Human Avatar Modeling - From Reconstruction to Generation
- Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling - [Code](https://github.com/lizhe00/AnimatableGaussians) ![Github stars](https://img.shields.io/github/stars/lizhe00/AnimatableGaussians.svg) ![Github forks](https://img.shields.io/github/forks/lizhe00/AnimatableGaussians.svg)
- Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors
- HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling
- AvatarReX: Real-time Expressive Full-body Avatars
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
- 4K4D: Real-Time 4D View Synthesis at 4K Resolution - real-time synthesis with 3DGS

### 2D talking-face synthesis
- arXiv 2024 - trained on `160 hours` of cleaned video data (collected from the internet)
- SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation - [notes](https://github.com/Jason-cs18/awesome-avatar/blob/main/notes/sadtalker.md)
- Wav2Lip: Accurately Lip-sync Videos to Any Speech - lip-sync model with low video quality (`96*96`), pre-trained on ~`180` hours of video data from [LRS2](https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html)
- Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
- Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation - [Code](https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS) ![Github stars](https://img.shields.io/github/stars/Hangz-nju-cuhk/Talking-Face_PC-AVS.svg) ![Github forks](https://img.shields.io/github/forks/Hangz-nju-cuhk/Talking-Face_PC-AVS.svg) - contrastive learning on audio-lip pairs (see the sync-loss sketch after this list)
- PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering
- StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN - high-fidelity synthesis via StyleGAN
- VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild - [Code](https://github.com/OpenTalker/video-retalking) ![Github stars](https://img.shields.io/github/stars/OpenTalker/video-retalking.svg) ![Github forks](https://img.shields.io/github/forks/OpenTalker/video-retalking.svg)
- DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video - lip-sync and high-quality synthesis (`256*256`)
- DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models - [Code](https://github.com/ali-vilab/dreamtalk) ![Github stars](https://img.shields.io/github/stars/ali-vilab/dreamtalk.svg) ![Github forks](https://img.shields.io/github/forks/ali-vilab/dreamtalk.svg) - diffusion-based
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
- LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control - expression retargeting
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation - [Code](https://github.com/fudan-generative-vision/hallo) ![Github stars](https://img.shields.io/github/stars/fudan-generative-vision/hallo.svg) ![Github forks](https://img.shields.io/github/forks/fudan-generative-vision/hallo.svg) - accurate lip-sync, diffusion, pre-trained on `164 hours` of cleaned video data (155 hours from the internet and 9 hours from HDTF)
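
For readers new to the lip-sync objectives referenced above (Wav2Lip's expert sync discriminator, PC-AVS's contrastive audio-lip learning), here is a minimal PyTorch sketch of the idea. The encoder architectures, feature shapes, and temperature are illustrative assumptions, not the papers' exact settings.

```python
# A minimal sketch of a SyncNet-style audio-lip objective, written in the
# contrastive (InfoNCE) form used by PC-AVS-like models. Shapes and module
# sizes below are assumptions for illustration, not the papers' networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # Encodes a short mel-spectrogram window (B, 1, 80, 16) to an embedding.
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(80 * 16, dim))

    def forward(self, mel):
        return F.normalize(self.net(mel), dim=-1)

class LipEncoder(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # Encodes a stack of 5 mouth-region RGB frames (B, 15, 48, 96).
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(15 * 48 * 96, dim))

    def forward(self, frames):
        return F.normalize(self.net(frames), dim=-1)

def sync_loss(audio_emb, lip_emb):
    # Matched audio/lip pairs sit on the diagonal of the similarity matrix;
    # cross-entropy pulls them together and pushes mismatched pairs apart.
    logits = audio_emb @ lip_emb.t() / 0.07  # temperature 0.07 (assumed)
    targets = torch.arange(len(logits), device=logits.device)
    return F.cross_entropy(logits, targets)

audio = AudioEncoder()(torch.randn(8, 1, 80, 16))
lips = LipEncoder()(torch.randn(8, 15, 48, 96))
print(sync_loss(audio, lips))
```

Wav2Lip instead trains this sync expert once and freezes it as a discriminator that scores generated mouths; the contrastive form above folds the same signal into a single loss.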

### Talking-body synthesis
- Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation - pose-guided character animation (see the pose-guider sketch after this list)
- arXiv 2024 - trained on `200 hours` of video data and more than `10k` unique identities
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance - [Code](https://github.com/fudan-generative-vision/champ) ![Github stars](https://img.shields.io/github/stars/fudan-generative-vision/champ.svg) ![Github forks](https://img.shields.io/github/forks/fudan-generative-vision/champ.svg)
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
- MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
- ControlNeXt: Powerful and Efficient Control for Image and Video Generation - [Code](https://github.com/dvlab-research/ControlNeXt) ![Github stars](https://img.shields.io/github/stars/dvlab-research/ControlNeXt.svg) ![Github forks](https://img.shields.io/github/forks/dvlab-research/ControlNeXt.svg) - based on Stable Video Diffusion
- Video-to-Video Synthesis
- Everybody Dance Now
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model - [Code](https://github.com/magic-research/magic-animate) ![Github stars](https://img.shields.io/github/stars/magic-research/magic-animate.svg) ![Github forks](https://img.shields.io/github/forks/magic-research/magic-animate.svg)
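
Several of the entries above (Animate Anyone, MusePose, MagicAnimate) condition a diffusion denoiser on a rendered pose sequence. Here is a minimal sketch of that pose-guider idea; the shapes and module sizes are assumptions for illustration, not any paper's exact architecture.

```python
# A lightweight convolutional "pose guider": it encodes a rendered skeleton
# map down to latent resolution and adds it to the noisy latent before the
# denoising UNet (omitted here), so pose controls structure while the
# reference image controls appearance via attention.
import torch
import torch.nn as nn

class PoseGuider(nn.Module):
    def __init__(self, latent_channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.SiLU(),    # 512 -> 256
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.SiLU(),   # 256 -> 128
            nn.Conv2d(32, latent_channels, 3, stride=2, padding=1)  # 128 -> 64
        )

    def forward(self, pose_map):
        return self.net(pose_map)

guider = PoseGuider()
pose_map = torch.randn(1, 3, 512, 512)   # rendered OpenPose/DWPose skeleton frame
noisy_latent = torch.randn(1, 4, 64, 64)
conditioned = noisy_latent + guider(pose_map)
print(conditioned.shape)  # torch.Size([1, 4, 64, 64])
```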

### Image and video generation
- Alias-Free Generative Adversarial Networks (StyleGAN3) - [Code](https://github.com/NVlabs/stylegan3) - high-fidelity face generation
- High-Resolution Image Synthesis with Latent Diffusion Models - [Code](https://github.com/CompVis/latent-diffusion) - diverse and high-quality images
- Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets - [Code](https://github.com/Stability-AI/generative-models)
- Scalable Diffusion Models with Transformers - [Code](https://github.com/facebookresearch/DiT) - the magic behind OpenAI Sora
- Neural Discrete Representation Learning - [DALL-E 2 and DALL-E 1 Explained](https://vaclavkosar.com/ml/openai-dall-e-2-and-dall-e-1) - the magic behind OpenAI DALL-E
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis - 3D synthesis via volume rendering (see the rendering sketch after this list)
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering - [Code](https://github.com/graphdeco-inria/gaussian-splatting) - real-time 3D rendering
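
Since NeRF's volume rendering underpins most of the 3D methods in this list, here is a minimal sketch of its rendering quadrature (Eq. 3 of the NeRF paper). Ray sampling and the radiance MLP are omitted; only the compositing step is shown.

```python
# Given per-sample densities and colors along one ray, accumulate
# transmittance-weighted color into a single pixel value.
import torch

def render_ray(sigma, rgb, deltas):
    """sigma: (N,) densities, rgb: (N, 3) colors, deltas: (N,) segment lengths."""
    alpha = 1.0 - torch.exp(-sigma * deltas)  # opacity of each ray segment
    # Transmittance T_i = prod_{j<i} (1 - alpha_j): an exclusive cumulative product.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha                   # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)  # composited RGB

n = 64
color = render_ray(torch.rand(n), torch.rand(n, 3), torch.full((n,), 0.03))
print(color)  # one rendered pixel color
```

3D Gaussian Splatting replaces the per-ray MLP queries with rasterized, depth-sorted Gaussians, but the same front-to-back alpha compositing is what makes it real-time.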

### 3D talking-face synthesis
- AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis - [Code](https://github.com/YudongGuo/AD-NeRF) ![Github stars](https://img.shields.io/github/stars/YudongGuo/AD-NeRF.svg) ![Github forks](https://img.shields.io/github/forks/YudongGuo/AD-NeRF.svg) - audio-conditioned NeRF (see the sketch after this list)
- Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
- GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
- Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis - [Code](https://github.com/Fictionarry/ER-NeRF) ![Github stars](https://img.shields.io/github/stars/Fictionarry/ER-NeRF.svg) ![Github forks](https://img.shields.io/github/forks/Fictionarry/ER-NeRF.svg)
- TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
- GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation
- SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
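
AD-NeRF and its successors above condition a radiance field on per-frame audio features. A minimal sketch of that conditioning follows; the feature sizes and the toy MLP are assumptions standing in for the papers' networks.

```python
# Condition the radiance field on audio by concatenating a per-frame audio
# feature (e.g. from a DeepSpeech/wav2vec window) to the positional encoding
# of every 3D sample, so the mouth region varies with speech.
import torch
import torch.nn as nn

class AudioConditionedField(nn.Module):
    def __init__(self, pos_dim=63, audio_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + audio_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (sigma, r, g, b) per sample
        )

    def forward(self, pos_enc, audio_feat):
        # Broadcast one audio feature across all N samples of the frame's rays.
        audio = audio_feat.expand(pos_enc.shape[0], -1)
        out = self.mlp(torch.cat([pos_enc, audio], dim=-1))
        sigma, rgb = out[:, :1], torch.sigmoid(out[:, 1:])
        return sigma, rgb

field = AudioConditionedField()
sigma, rgb = field(torch.randn(1024, 63), torch.randn(1, 64))
print(sigma.shape, rgb.shape)  # torch.Size([1024, 1]) torch.Size([1024, 3])
```

The rendered densities and colors then go through the same volume-rendering quadrature sketched in the previous section.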

## Researchers and labs
- **NVIDIA Research**
  - Lumos SIGGRAPH Asia 2022
  - vid2vid NeurIPS'18, [few-shot vid2vid NeurIPS'19](https://nvlabs.github.io/few-shot-vid2vid/), [EG3D CVPR'22](https://github.com/NVlabs/eg3d)
  - face-vid2vid CVPR'21 (one-shot talking-head synthesis)
  - DreamPose ICCV'23
  - Avatar Fingerprinting arXiv'23
- **Aliaksandr Siarohin @ Snap Research**
  - Unsupervised-Volumetric-Animation CVPR'23, [3D-SGAN ECCV'22](https://arxiv.org/abs/2112.01422), [Articulated-Animation CVPR'21](https://arxiv.org/abs/2104.11280), [Monkey-Net CVPR'19](https://arxiv.org/abs/1812.08861), [FOMM NeurIPS'19](http://papers.nips.cc/paper/8935-first-order-motion-model-for-image-animation)
- **Ziwei Liu @ Nanyang Technological University**
  - StyleSync CVPR'23, [AV-CAT SIGGRAPH Asia 2022](https://hangz-nju-cuhk.github.io/projects/AV-CAT), [StyleGANEX ICCV'23](https://www.mmlab-ntu.com/project/styleganex/), [StyleSwap ECCV'22](https://hangz-nju-cuhk.github.io/projects/StyleSwap), [PC-AVS CVPR'21](https://hangz-nju-cuhk.github.io/projects/PC-AVS), [Speech2Talking-Face IJCAI'21](https://www.ijcai.org/proceedings/2021/0141.pdf), [VToonify SIGGRAPH Asia 2022](https://www.youtube.com/watch?v=0_OmVhDgYuY)
  - MotionDiffuse arXiv'22
  - Relighting4D ECCV'22
- **Xiaodong Cun @ Tencent AI Lab**
  - StyleHEAT ECCV'22
  - LivelySpeaker ICCV'23
- **Gordon Wetzstein @ Stanford University**
  - SSIF SIGGRAPH'23
  - FLAME SIGGRAPH Asia 2017