Total Selfie: Generating Full-Body Selfies


Total Selfie generates full-body selfies of you (right) from self-captured images of your face and body (top left) and a background shot. You can choose any target pose from a reference photo; we auto-select a set of good candidates from your photo collection (bottom).

Abstract

We present a method to generate full-body selfies from photographs originally taken at arm's length. Because self-captured photos are typically taken close up, they have a limited field of view and exaggerated perspective that distorts facial shapes. We instead seek to generate the photo someone else would take of you from a few feet away. Our approach takes as input four selfies of your face and body along with a background image, and generates a full-body selfie in a desired target pose. We introduce a novel diffusion-based approach that combines all of this information into high-quality, well-composed photos of you with the desired pose and background.
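
For concreteness, the capture-to-output contract can be sketched as below. This is an illustrative sketch only; the names and the exact breakdown of the four selfies (face, upper body, lower body, shoes) are assumptions for illustration, not the released interface.

from dataclasses import dataclass
from PIL import Image

@dataclass
class TotalSelfieCapture:
    # Self-captured inputs; the four-way split below is an assumed
    # breakdown of the "four selfies of your face and body".
    face_selfie: Image.Image
    upper_body_selfie: Image.Image
    lower_body_selfie: Image.Image
    shoe_selfie: Image.Image
    background: Image.Image  # the scene photographed without the subject

def generate_full_body_selfie(capture: TotalSelfieCapture,
                              target_pose: Image.Image) -> Image.Image:
    # Placeholder: stands in for the diffusion pipeline described in
    # the overview below, which composes the subject into the
    # background in the target pose.
    raise NotImplementedError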


Video

Total Selfie


Overview of Total Selfie. First, we train a selfie-conditioned inpainting model on a synthetic selfie-to-full-body dataset (blue box). Second, we fine-tune the trained model on a specific capture (orange box) and use it to produce a full-body selfie with the help of a modified ControlNet (for pose) and appearance refinement (for face and shoes), visualized in the purple box. Note that the images in the green dashed box (inside the orange box) serve as inputs and conditioning signals to the inpainting model; arrows are omitted for simplicity.
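
The three stages in the figure can be summarized in code. This is a hedged pseudocode sketch: the helper functions (train_inpainting_model, fine_tune_on_capture, pose_control, refine_appearance) are hypothetical placeholders mirroring the stage ordering above, not the actual codebase.

def train_inpainting_model(synthetic_pairs):
    # Stage 1 (blue box): pre-train a selfie-conditioned inpainting
    # diffusion model on synthetic selfie-to-full-body pairs.
    raise NotImplementedError  # hypothetical placeholder

def fine_tune_on_capture(inpainter, capture):
    # Stage 2 (orange box): fine-tune on the user's own selfies and
    # background so the model learns this subject's appearance.
    raise NotImplementedError  # hypothetical placeholder

def pose_control(target_pose):
    # Conditioning signal from the modified ControlNet (for pose).
    raise NotImplementedError  # hypothetical placeholder

def refine_appearance(image, regions):
    # Post-hoc refinement of difficult regions (face and shoes).
    raise NotImplementedError  # hypothetical placeholder

def total_selfie(synthetic_pairs, capture, target_pose):
    inpainter = train_inpainting_model(synthetic_pairs)   # stage 1
    inpainter = fine_tune_on_capture(inpainter, capture)  # stage 2
    # Stage 3 (purple box): inpaint the person into the background,
    # steering body pose with the ControlNet signal.
    draft = inpainter.generate(capture, pose_control(target_pose))
    return refine_appearance(draft, regions=("face", "shoes"))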


Experiments

Limitations

Total Selfie has several limitations: (1) While our method generally yields a harmonized output (b), the shading of the body may not precisely match that of the actual photo (c). (2) Our method cannot accurately generate the hard shadows a person casts under strong sunlight, since inferring the sun's direction and scene geometry solely from the background is difficult.


Acknowledgements

This work was supported by the UW Reality Lab, Meta, Google, OPPO, and Amazon. Special thanks to Jingyi Lin, Jingwei Ma, Kunbo Ding, Qichen Fu, Bohan Chen, Yu Pan, Yuqun Wu, Yuhan Fang, and Mengyi Shan for their support and help with collecting data.