harlanhong / awesome-talking-head-generation

face-reenactment image-animation motion-transfer talking-head

awesome-talking-head-generation

A collection of papers on talking head generation, with links to released code.

For any addition or correction regarding talking head generation, please open an issue, submit a pull request, or e-mail me at [email protected]. If you are doing research on talking head generation, you can add my Discord account, Fa-Ting Hong#6563, for better communication and cooperation.

🔥I am currently seeking a job or postdoctoral position. If you are interested in my qualifications and experience, please feel free to contact me. 🔥

Related Group

Datasets

VoxCeleb1 [Download link].
VoxCeleb2 [Download link].
FaceForensics++ [Download link].
CelebV [Download link].
TalkingHead-1KH [Download link].
LRW (Lip Reading in the Wild) [Download link].
MEAD [Download link].
CelebV-HQ [Download link].
CHDTF [Download link].

Image-driven

2016

[Face2face] Face2face: Real-time face capture and reenactment of RGB videos, CVPR 2016.

2018

[ReenactGAN] ReenactGAN: Learning to Reenact Faces via Boundary Transfer, ECCV 2018. [Code].
[X2Face] X2Face: A network for controlling face generation by using images, audio, and pose codes, ECCV 2018. [Code], [Project].

2019

[FOMM] First order motion model for image animation, NeurIPS 2019. [Code].
[NeuralHead] Few-Shot Adversarial Learning of Realistic Neural Talking Head models, ICCV 2019. [Code].
[Monkey-Net] Animating Arbitrary Objects via Deep Motion Transfer, CVPR 2019 Oral. [Code], [Project].
[fs-vid2vid] Few-shot Video-to-Video Synthesis, NeurIPS 2019. [Code], [Project].

2020

[MeshG] Mesh Guided One-shot Face Reenactment Using Graph Convolutional Networks, ACM Multimedia 2020. [Code].
[MarioNETte] MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets, AAAI 2020. [Project].
[CrossID-GAN] Learning Identity-Invariant Motion Representations for Cross-ID Face Reenactment, CVPR 2020.

2021

[face-vid2vid] One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing, CVPR 2021 Oral. [Project].
[S2D] Sparse to Dense Motion Transfer for Face Image Animation, ICCV 2021.
[SAFA] SAFA: Structure Aware Face Animation, 3DV 2021. [Code]
[SAA] Self-appearance-aided Differential Evolution for Motion Transfer, arXiv 2021.
[PIRenderer] PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering, ICCV 2021. [Code]
[FaceGAN] FACEGAN: Facial Attribute Controllable rEenactment GAN, WACV 2021.
[F^3A-GAN] F3A-GAN: Facial Flow for Face Animation With Generative Adversarial Networks, IEEE TIP 2021.
[FACIAL] FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning, ICCV 2021.
[MRAA] Motion Representations for Articulated Animation, CVPR 2021. [Code]
[HeadGAN] HeadGAN: One-shot Neural Head Synthesis and Editing, ICCV 2021. [Project]

2022

[DaGAN] Depth-Aware Generative Adversarial Network for Talking Head Video Generation, CVPR 2022. [Code], [Project]
[TPSM] Thin-Plate Spline Motion Model for Image Animation, CVPR 2022. [Code]
[StyleHEAT] StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pretrained StyleGAN, ECCV 2022. [Code], [Project]
[MegaPortraits] MegaPortraits: One-shot Megapixel Neural Head Avatars, ACM MM 2022. [Project]
[DAM] Structure-Aware Motion Transfer with Deformable Anchor Model, CVPR 2022. [Code]
[StyleMask] StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment, FG 2023. [Code]
[CoRF] Controllable Radiance Fields for Dynamic Face Synthesis, arXiv 2022.
[AniFaceGAN] Animatable 3D-Aware Face Image Generation for Video Avatars, NeurIPS 2022. [Project]
[IW] Implicit Warping for Animation with Image Sets, NeurIPS 2022. [Project]
[HifiHead] HifiHead: One-Shot High Fidelity Neural Head Synthesis with 3D Control, IJCAI 2022.
Face Animation with Multiple Source Images, arXiv 2022.
[MetaPortrait] MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation, arXiv 2022.
Compressing Video Calls using Synthetic Talking Heads, BMVC 2022. [Project]
Finding Directions in GAN’s Latent Space for Neural Face Reenactment, BMVC 2022. [Project] [Code]
[LIA] Latent Image Animator: Learning to Animate Images via Latent Space Navigation, ICLR 2022. [Project] [Code]

2023

[AVFR-GAN] Audio-Visual Face Reenactment, WACV 2023. [Code], [Project]
[TS-Net] Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis, WACV 2023. [Code]
[MCNET] Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation, ICCV 2023. [Project] [Code]

2024

[X-Portrait] X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention, arXiv 2024.
[LivePortrait] LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control [Code] [Project]
[EMOPortraits] EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars, CVPR 2024. [Code], [Project]

Audio-driven

2016

[LRW] Lip Reading in the Wild, ACCV 2016.

2017

[Synthesizing-Obama] Synthesizing Obama: Learning Lip Sync From Audio, SIGGRAPH 2017. [Project].
[You-Said-That?] You Said That?: Synthesising Talking Faces From Audio, IJCV 2019. [Code].
Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion, SIGGRAPH 2017.
A Deep Learning Approach for Generalized Speech Animation, SIGGRAPH 2017.

2018

Lip Movements Generation at a Glance, ECCV 2018. [Code].
[VisemeNet] VisemeNet: Audio-Driven Animator-Centric Speech Animation, SIGGRAPH 2018.

2019

[DAVS] Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, AAAI 2019. [Code].
[ATVGnet] Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss, CVPR 2019. [Code]

2020

[Wav2Lip] A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild, ACM Multimedia 2020. [Code], [Project].
[RhythmicHead] Talking-head Generation with Rhythmic Head Motion, ECCV 2020. [Code].
[MakeItTalk] MakeItTalk: Speaker-Aware Talking-Head Animation, SIGGRAPH Asia 2020. [Code], [Project].
[Neural Voice Puppetry] Neural Voice Puppetry: Audio-driven Facial Reenactment, ECCV 2020. [Code], [Project].
[MEAD] MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation, ECCV 2020. [Code], [Project].
Realistic Speech-Driven Facial Animation with GANs, IJCV 2020.

2021

[PC-AVS] Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, CVPR 2021. [Code], [Project].
[IATS] Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis, ACM Multimedia 2021.
[EVP] Audio-Driven Emotional Video Portraits, CVPR 2021. [Code]
[FAU] Talking Head Generation with Audio and Speech Related Facial Action Units, arXiv 2021.
[Speech2Talking-Face] Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation, IJCAI 2021.
[LSP] Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation, ACM TOG 2021. [Code]
[Audio2head] Audio2head: Audio-driven one-shot talking-head generation with natural head motion, arXiv 2021.

2022

[GC-AVT] Expressive Talking Head Generation with Granular Audio-Visual Control, CVPR 2022.
Talking Face Generation with Multilingual TTS, CVPR 2022. [Demo Track].
[EAMM] EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model, SIGGRAPH 2022.
[SPACEx] SPACEx 🚀: Speech-driven Portrait Animation with Controllable Expression, arXiv 2022. [Project] CVPR 2023
[AV-CAT] Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers, SIGGRAPH Asia 2022.
[MemFace] Memories are One-to-Many Mapping Alleviators in Talking Face Generation, arXiv 2022.

2023

[Diffused Heads] Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation, arXiv 2023. [Project] 🔥Diffusion🔥
[DiffTalk] DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis, arXiv 2023. [Project] [Code] 🔥Diffusion🔥
[READ] READ Avatars: Realistic Emotion-controllable Audio Driven Avatars, arXiv 2023.
[DAE-Talker] DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder, arXiv 2023. 🔥Diffusion🔥
[EmoGen] Emotionally Enhanced Talking Face Generation, arXiv 2023. [Code]
[TalkLip] Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert, CVPR 2023. [Code]
[StyleSync] StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator, CVPR 2023. [Project] [Code]
[GeneFace++] GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation, arXiv 2023. [Project] [Code]
[MODA] MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions, ICCV 2023.
[VividTalk] VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior, arXiv 2023. [Project] [Code]
[IP_LAP] IP_LAP: Identity-Preserving Talking Face Generation with Landmark and Appearance Priors, CVPR 2023. [Code]
[HyperLips] HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation, CVPR 2023. [Code]
[EAT] Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation, ICCV 2023. [Project] [Code]

2024

[Real3DPortrait] Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis, ICLR 2024. [Project] [Code]
[EMO] Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions, arXiv 2024. [Project] [Code]
[Style2Talker] Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style, AAAI 2024.
[SaaS] Say Anything with Any Style, AAAI 2024.
[MuseTalk] Real-Time High Quality Lip Synchronization with Latent Space Inpainting, [Code].
[VASA-1] VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time, arXiv 2024. [Project].
[THQA] THQA: A Perceptual Quality Assessment Database for Talking Heads, arXiv 2024. [Code].
[Talk3D] Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior, arXiv 2024. [Code] [Project]
[EDTalk] EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis, arXiv 2024. [Code] [Project]
[AniPortrait] AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations, arXiv 2024. [Code]
[FlowVQTalker] FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization, arXiv 2024.
[FaceChain-ImagineID] FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio, arXiv 2024. [Code]
[Hallo] Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation, arXiv 2024. [Code]
[EchoMimic] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions, arXiv 2024. [Code], [Project]
[RealTalk] RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network
[Emotional Conversation] Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation
[Make Your Actor Talk] Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement
[EMOPortraits] EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars, CVPR 2024. [Code], [Project]

Nerf & 3D

2021

[DFA-NeRF] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering, arXiv 2021.
[NerFACE] NerFACE: Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction, CVPR 2021 Oral. [Code], [Project]
[AD-NeRF] AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis, ICCV 2021. [Code], [Code]

2022

[SSP-NeRF] Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation, arXiv 2022.
[HeadNeRF] HeadNeRF: A Real-time NeRF-based Parametric Head Model, CVPR 2022. [Code], [Project]
[IMavatar] I M Avatar: Implicit Morphable Head Avatars from Videos, CVPR 2022. [Code]
[ROME] Realistic One-shot Mesh-based Head Avatars, ECCV 2022.
[FNeVR] FNeVR: Neural Volume Rendering for Face Animation, arXiv 2022. [Code]
[3DFaceShop] 3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation, arXiv 2022. [Code], [Project]
[Next3D] Generative Neural Texture Rasterization for 3D-Aware Head Avatars, arXiv 2022. [Project]
[NeRFInvertor] NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation, arXiv 2022.
[DFRF] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis, ECCV 2022. [Code]

2024

[CVTHead] CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer, WACV 2024. [Code].
[Head3D] 3D-Aware Talking-Head Video Motion Transfer, WACV 2024.

Parameter-Based

2020

[DiscoFaceGAN] Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning, CVPR 2020 Oral. [Code].

Survey

2020

What comprises a good talking-head video generation?: A Survey and Benchmark.

2024

A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos [Code].

awesome-talking-head-generation's Issues

An alternative project EchoMimic is open-sourced

EchoMimic can generate portrait videos not only from audio or facial landmarks individually, but also from a combination of audio and selected facial landmarks.
Project link: https://badtobest.github.io/echomimic.html
GitHub link: https://github.com/BadToBest/EchoMimic

Discord link not working

The Discord link you provide in the description (https://discord.com/invite/3K74mkQ5) is not working; it appears to have expired.
