mayuema's Projects
[ACM MM 2022 Oral] AKU: This repo is the official implementation of "Visual Knowledge Graph for Human Action Reasoning in Videos"
Official PyTorch implementation for the paper "AnimateZero: Video Diffusion Models are Zero-Shot Image Animators"
Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
Deep Learning Interview Handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and related areas)
[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
PyTorch implementation for "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
[arXiv 2024] Follow-Your-Canvas: This repo is the official implementation of "Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation"
[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"
[SIGGRAPH Asia 2024] Follow-Your-Emoji: This repo is the official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
[arXiv 2023] Follow-Your-Handle: This repo is the official implementation of "MagicStick: Controllable Video Editing via Control Handle Transformations"
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
[AAAI 2024] This is the implementation for the paper "M-BEV: Masked BEV Perception for Robust Autonomous Driving"
Just instruction
My HomePage
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[ACM MM 2022] Multi-Modal Experience Inspired AI Creation
PyCIL: A Python Toolbox for Class-Incremental Learning
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
[ICLR 2022] Official implementation of UniFormer
A Toolkit for Text-to-Video Generation and Editing
[NeurIPS 2022] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training