DreamActor-M1

Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance

A DiT-based human animation framework that uses hybrid guidance to achieve fine-grained holistic controllability, multi-scale adaptability, and long-term temporal coherence.

DreamActor-M1 Demo


Method Overview

DreamActor-M1 is a DiT-based human animation framework with hybrid guidance that enables fine-grained control over facial expressions and body movements.

DreamActor-M1 Architecture

Pose Encoder: encodes body skeletons and head spheres into the pose latent
DiT Blocks: diffusion transformer processing
Face Motion Encoder: produces implicit facial representations
3D VAE: video latent encoding and decoding

Training Process
The end-to-end training framework of DreamActor-M1

During training, we first extract body skeletons and head spheres from the driving frames, then encode them into a pose latent using the pose encoder.

The resulting pose latent is combined with the noised video latent along the channel dimension. The video latent is obtained by encoding a clip from the full input video with a 3D VAE.
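As a rough illustration of combining two latents along the channel dimension, the sketch below uses plain Python lists so it runs without dependencies. It assumes that "combined" means concatenation, that both latents share the same temporal and spatial sizes, and that noising is a simple additive Gaussian step. The shapes, the [C][T][H][W] layout, and all function names are hypothetical and are not taken from the DreamActor-M1 implementation.

```python
import random

def make_latent(channels, frames, height, width, fill):
    """Build a nested-list tensor with layout [C][T][H][W] (hypothetical layout)."""
    return [[[[fill for _ in range(width)] for _ in range(height)]
             for _ in range(frames)] for _ in range(channels)]

def shape(t):
    """Return the shape of a rectangular nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)

def add_noise(latent, sigma, rng):
    """Add Gaussian noise element-wise (a stand-in for the diffusion noising step)."""
    return [[[[v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in plane]
             for plane in chan] for chan in latent]

def concat_channels(a, b):
    """Concatenate two [C][T][H][W] latents along the channel axis."""
    if shape(a)[1:] != shape(b)[1:]:
        raise ValueError("temporal and spatial sizes must match")
    return a + b  # the outermost list is the channel axis

rng = random.Random(0)
video_latent = make_latent(4, 3, 2, 2, fill=1.0)  # stands in for the 3D VAE output
pose_latent = make_latent(4, 3, 2, 2, fill=0.5)   # stands in for the pose encoder output

noised_video_latent = add_noise(video_latent, sigma=0.1, rng=rng)
dit_input = concat_channels(noised_video_latent, pose_latent)
print(shape(dit_input))  # (8, 3, 2, 2)
```

In a tensor library the same step would be a single concatenation call on the channel axis. The point here is only that the pose latent adds channels to the noised video latent and leaves the frame count and spatial resolution unchanged.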

Facial expressions are additionally encoded by the face motion encoder to produce implicit facial representations, which are integrated via cross-attention within each DiT block.
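To make the cross-attention step concrete, here is a minimal single-head sketch in plain Python. Video tokens act as queries and face motion tokens act as keys and values. It is an assumption-laden illustration: learned query/key/value projections, multiple heads, and normalization are omitted, the residual connection is assumed, and the names `cross_attention`, `video_tokens`, and `face_tokens` are hypothetical rather than part of the DreamActor-M1 code.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    """Single-head scaled dot-product cross-attention without learned projections.

    queries: list of N vectors (e.g. video tokens inside a DiT block)
    context: list of M vectors (e.g. implicit face motion tokens)
    Returns N vectors: each query plus its attention-weighted mix of the context.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every context token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in context]
        weights = softmax(scores)
        # Weighted sum of the context vectors (they serve as values here).
        attended = [sum(w * v[j] for w, v in zip(weights, context))
                    for j in range(d)]
        # Residual connection (an assumption of this sketch).
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out

video_tokens = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-dim video tokens
face_tokens = [[0.5, 0.5], [1.0, -1.0]]  # hypothetical face motion tokens
updated = cross_attention(video_tokens, face_tokens)
print(len(updated), len(updated[0]))  # 2 2
```

The output has the same number of tokens and the same width as the queries, which is why a conditioning signal of this kind can be injected into a block without changing the shape of the video token sequence.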

3D Head Sphere Control

DiT (Diffusion Transformer)

3D Body Skeleton Control

Results & Capabilities

DreamActor-M1 delivers expressive results for portraits, upper-body, and full-body generation with robust long-term consistency.

Character and Motion Style Diversity

Our method is robust to various character and motion styles.

Style Examples 1 to 3: Character & Motion

Experience the Future of Human Animation

DreamActor-M1 outperforms state-of-the-art works, delivering expressive results for portraits, upper-body, and full-body generation.

Paper & Citation

DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance

Paper Information

Authors

Yuxuan Luo*, Zhengkun Rong*, Lizhen Wang*, Longhao Zhang*, Tianshu Hu*†, Yongming Zhu

*Equal Contribution †Corresponding Author

Affiliation

Bytedance Intelligent Creation

BibTeX Citation

@misc{luo2025dreamactorm1holisticexpressiverobust,
  title={DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance}, 
  author={Yuxuan Luo and Zhengkun Rong and Lizhen Wang and Longhao Zhang and Tianshu Hu and Yongming Zhu},
  year={2025},
  eprint={2504.01724},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2504.01724}, 
}

Ethics Statement

The images and videos used in demos are sourced from public domains or generated by models, and are intended solely to showcase the capabilities of this research. If you have any concerns, please contact us (hutianshu007@gmail.com) and we will remove the content in question promptly.