Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-ai-talking-heads
A curated list of 'Talking Head Generation' resources. Features influential papers, groundbreaking algorithms, crucial GitHub repositories, insightful videos, and more. Ideal for AI enthusiasts, researchers, and graphics professionals.
https://github.com/Curated-Awesome-Lists/awesome-ai-talking-heads
Articles & Blogs
- How to Create Fake Talking Head Videos With Deep Learning (Code Tutorial)
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head - A multi-modal AI system that can process complex audio information and understand and generate speech, music, sound, and talking head content.
- Text-based Editing of Talking-head Video - A method for editing talking-head videos using text-based instructions.
- DisCoHead: Audio-and-Video-Driven Talking Head Generation
- Microsoft's 3D Photo Realistic Talking Head
- Talking-head Generation with Rhythmic Head Motion - Generates talking-head videos with natural head movements, addressing the challenge of producing lip-synced videos that also incorporate natural head motion. The approach uses a 3D-aware generative network with a hybrid embedding module and a non-linear composition module, yielding controllable and photo-realistic results.
- Learned Spatial Representations for Few-shot Talking-Head Synthesis - Addresses few-shot talking-head synthesis by factorizing the representation of a subject into spatial and style components. The method predicts a dense spatial layout for the target image and uses it to synthesize the target frame, better preserving the subject's identity from the source images.
- High-Fidelity and Freely Controllable Talking Head Video Generation - Generates high-quality, controllable talking-head videos. It introduces a model that combines self-supervised learned landmarks with 3D face model-based landmarks to model motion, along with a motion-aware multi-scale feature alignment module, producing high-fidelity videos with free control over head pose and expression.
- Implicit Identity Representation Conditioned Memory Compensation - A network for high-fidelity talking head generation. The module learns a unified spatial facial meta-memory bank that compensates warped source facial features, overcoming limitations caused by complex motions in the driving video and improving generation quality.
- Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos - Addresses verifying authorized use of synthetic talking-head videos. It proposes an embedding that groups the motion signatures of one identity together, enabling identification of synthetic videos in which a specific individual's appearance is driven by someone else's expressions.
- Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis - A one-shot talking head synthesis method that achieves disentangled control over lip motion, eye gaze and blink, head pose, and emotional expression. It uses a progressive disentangled representation learning strategy to isolate each motion factor, enabling fine-grained control and high-quality speech and lip-motion synchronization.
- VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing in the Wild - Edits real-world talking head videos to match input audio. It disentangles the task into face video generation, audio-driven lip-sync, and face enhancement, producing a high-quality, lip-synced output video through a sequential learning-based pipeline without user intervention.
- One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing - A talking-head video synthesis model that learns from a source image containing the target person's appearance and a driving video that supplies the motion. It achieves high visual quality and bandwidth efficiency, outperforming competing methods on benchmark datasets.
- Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation - Presents the Efficient Emotional Adaptation for audio-driven Talking-head (EAT) method, which transforms emotion-agnostic talking-head models into emotion-controllable ones in a cost-effective and efficient manner. Lightweight adaptations enable precise and realistic emotion control, achieving state-of-the-art performance on widely used benchmarks.
- Style Transfer for 2D Talking Head Animation - Transfers speaking styles using style-pattern construction and a style-aware image generator, producing more photo-realistic and faithful 2D animation than recent state-of-the-art methods.
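Several entries above (VideoReTalking in particular) describe the same design: a sequential, intervention-free pipeline that disentangles editing into face video generation, audio-driven lip-sync, and face enhancement. A minimal structural sketch of such a staged pipeline, with hypothetical stub stages standing in for the learned models (this is not the actual VideoReTalking code):

```python
from typing import Callable, List

# Hypothetical stage functions; a real system would wrap learned models here.
def generate_face_video(frames: List[str]) -> List[str]:
    """Stage 1: stabilize/normalize the input face frames."""
    return [f"stabilized:{f}" for f in frames]

def lip_sync(frames: List[str], audio: str) -> List[str]:
    """Stage 2: re-generate the mouth region to match the input audio."""
    return [f"{f}+lips({audio})" for f in frames]

def enhance_faces(frames: List[str]) -> List[str]:
    """Stage 3: restore fine facial detail in the edited frames."""
    return [f"enhanced[{f}]" for f in frames]

def run_pipeline(frames: List[str], audio: str) -> List[str]:
    """Chain the three stages in sequence, with no user intervention."""
    stages: List[Callable[[List[str]], List[str]]] = [
        generate_face_video,
        lambda fs: lip_sync(fs, audio),
        enhance_faces,
    ]
    for stage in stages:
        frames = stage(frames)
    return frames
```

The point of the staged decomposition is that each sub-problem (generation, sync, enhancement) can be trained and swapped independently.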
Online Courses
- Video Production: You Can Make Simple Talking Head Video | Udemy
- The Complete Talking Head Video Production Masterclass | Udemy - An in-depth course.
- Video Production - Inexpensive Talking Head Video - Business | Udemy
- How to Create a Talking Head Video | Udemy - A beginner-friendly course covering the technicalities of filming and creating talking head videos.
Research Papers
- Talking-Heads Attention - Introduces "talking-heads attention," a variation on multi-head attention that improves language modeling and comprehension tasks.
- MakeItTalk: Speaker-Aware Talking-Head Animation
- StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles - Generates one-shot talking heads with diverse personalized speaking styles.
- DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation - Models talking head generation as an audio-driven denoising process using Latent Diffusion Models.
- One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field - Generates high-fidelity talking heads by employing explicit 3D structural representations.
- AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis - Generates high-fidelity talking head videos directly from input audio using neural scene representation networks.
- What comprises a good talking-head video generation?: A Survey - A survey of talking-head video generation, addressing the limitations of subjective evaluation. It explores desired properties such as identity preservation, lip synchronization, high video quality, and natural, spontaneous motion.
- Text-based Editing of Talking-head Video - Edits talking-head videos based on their transcript, allowing modifications to the speech content while maintaining a seamless audio-visual flow. It uses annotations of facial features and a parametric face model to produce realistic video output.
- MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation - An identity-preserving talking head generation framework that uses dense landmarks to obtain accurate geometry-aware flow fields. It also proposes adaptive fusion of the source identity during synthesis and a fast adaptation model based on meta-learning for personalized fine-tuning.
- Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis - A method for few-shot talking head synthesis that generalizes to unseen identities with limited training data. It conditions the face radiance field on 2D appearance images, allowing flexible adjustment to new identities with few reference images.
- Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motions - An audio-driven talking-head method that produces photo-realistic videos from a single reference image. It addresses the challenges of producing natural head motions that match the speech prosody while preserving appearance during large head motions, using a head pose predictor and a motion field generator.
- Talking Head Generation with Probabilistic Audio-to-Visual Diffusion - One-shot audio-driven talking head generation with a probabilistic approach. It generates facial motions matching the input audio while maintaining audio-lip synchronization and overall photo-realism, avoiding the need for additional driving sources for controlled synthesis.
- Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Video Generation - A text-based talking-head video generation framework that synthesizes facial expressions and head motions according to contextual sentiment and speech rhythm. It consists of a speaker-independent stage and a speaker-specific stage, allowing tailored video synthesis for different individuals.
- Few-Shot Adversarial Learning of Realistic Neural Talking Head Models
- Depth-Aware Generative Adversarial Network for Talking Head Video Generation - Uses a self-supervised geometry learning method and leverages dense 3D facial geometry for accurate talking head video generation.
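The Talking-Heads Attention entry above describes a small but concrete change to multi-head attention: learned linear projections mix the attention logits across heads before the softmax, and mix the attention weights across heads again after it. A NumPy sketch of that mechanism (shapes and parameter names are my own; with identity mixing matrices it reduces to standard multi-head attention):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def talking_heads_attention(Q, K, V, P_l, P_w):
    """Talking-heads attention sketch.

    Q, K: (h, n, d_k), V: (h, n, d_v)
    P_l: (h, h) mixes attention logits across heads before the softmax
    P_w: (h, h) mixes attention weights across heads after the softmax
    """
    d_k = Q.shape[-1]
    logits = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)  # (h, n, n)
    logits = np.einsum('hij,hk->kij', logits, P_l)    # mix logits across heads
    weights = softmax(logits, axis=-1)
    weights = np.einsum('hij,hk->kij', weights, P_w)  # mix weights across heads
    return weights @ V                                 # (h, n, d_v)
```

The cross-head projections let information flow between heads at negligible parameter cost (two h-by-h matrices), which is the source of the reported quality gains.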
Tools & Software
- face3D_chung
- CrazyTalk
- tts avatar free download - SourceForge - A cross-browser, multi-platform talking head. (🔧👄)
- Best Open Source BASIC 3D Modeling Software
- DVDStyler / Discussion / Help: ffmpeg-vbr or internal
- LUCIA - An MPEG-4 Talking Head Engine. 💻
- Yepic Studio - Create talking head-style videos in minutes without expensive equipment. 🎥
- Mel McGee's Talkbots - A cross-browser, multi-platform talking head application in SVG, suitable for web sites or as an avatar. 🗣️
- Verbatim AI - Product Information, Latest Updates, and Reviews 2023 - Generate talking heads in real time with Verbatim AI. Add interest, intrigue, and dynamism to your chat bots! (🔧👄)
- puffin web browser free download - SourceForge - A cross-browser, multi-platform talking head. (🔧👄)
- 12 best AI video generators to use in 2023 [Free and paid] - Create high-quality videos from scratch. (🔧🎥)
Slides & Presentations
- (Paper Review) Few-Shot Adversarial Learning of Realistic Neural Talking Head Models - A review of few-shot adversarial learning of realistic neural talking head models.
- Nethania Michelle's Character | PPT
- Research Presentation | PPT
- Awesome List Generator - An open-source Python package that uses the power of GPT models to automatically curate and generate starting points for resource lists on a given topic.
- Presenting you: Top tips on presenting with Prezi Video – Prezi
- Adding narration to your presentation (using Prezi Video) – Prezi ...
GitHub projects
- AudioGPT
- SadTalker - Audio-Driven Single Image Talking Face Animation. 🎭🎶
- Thin-Plate-Spline-Motion-Model - Thin-Plate Spline Motion Model for Image Animation. 🖼️
- GeneFace - Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; official code. 👤💬
- CVPR2022-DaGAN - Depth-Aware Generative Adversarial Network for Talking Head Video Generation. 👥📹
- sd-wav2lip-uhq
- Text2Video - "Text-driven talking-head video synthesis with phonetic dictionary". 🔤🎞️
- OTAvatar - One-shot Talking Face Avatar with Controllable Tri-plane Rendering [CVPR 2023]. 👤🎭
- Audio2Head - Code for "Audio-driven One-shot Talking-head Generation with Natural Head Motion" (IJCAI 2021). 🗣️👤
- IP_LAP - Identity-Preserving Talking Face Generation With Landmark and Appearance Priors. 🔥🤖
- Wunjo AI - time speech recognition, deepfake face & lips animation, face swap with one photo, change video by text prompts, segmentation, and retouching. Open-source, local & free. 🗣️👤💬
- LIHQ - Long-Inference, High Quality Synthetic Speaker (AI avatar / AI presenter). 🎙️👤
- Co-Speech-Motion-Generation
- Neural Head Reenactment with Latent Pose Descriptors
- NED
- WACV23_TSNet - "Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis". 🎬✨
- ICCV2023-MCNET
- Speech2Video
- StyleLipSync - "Style-based Personalized Lip-sync Video Generation". 💋🎥
Keywords
talking-head (17), talking-face-generation (7), image-animation (6), deepfake (5), deep-learning (5), motion-transfer (5), gan (5), face-reenactment (4), pytorch (4), face (4), talking-face (4), talking-heads (3), deep-fake (3), cvpr2023 (3), audio-driven-talking-face (3), video (3), avatar (3), tts (2), deepfakes (2), face-swapping (2), generative-ai (2), aigc (2), pose-transfer (2), face-animation (2), image-generation (2), deep-fakes (2), speech (2), vid2vid (2), text-to-video (1), talking (1), speech-synthesis (1), metaverse (1), icassp (1), digital-humanities (1), virtual-humans (1), codes (1), wav2lip (1), stable-diffusion-webui-plugin (1), stable-diffusion-webui (1), stable-diffusion-web-ui (1), meta-human-creator (1), lipsync (1), lip-sync (1), faceswap (1), depth (1), nerf (1), sound (1), music (1), gpt (1), audio (1)