Generating Dance Videos using Pose Transfer Generative Adversarial Network with Multiple Scale Region Extractor and Learnable Region Normalization

Hsu Yung Cheng, Chih Chang Yu, Chih Lung Lin

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose a pose transfer framework that can handle large body motions to generate dance videos. To address body shape deformation caused by large movements, a Multiple Scale Region Extractor (MSRE) is proposed. The features of each body region are extracted from multiple layers of the encoder according to the body key points and passed through shortcuts to the decoder, reducing the loss of spatial information. We add a region style loss, computed from the style representations of the body regions, to the loss function to improve the quality of the generated images. In addition, the concept of learnable region normalization is integrated into the proposed framework to prevent corrupted regions from introducing undesired mean and variance shifts during normalization. Experiments show that the proposed system significantly improves pose generation results compared with existing methods, especially when the dancing poses involve large body movements.
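The abstract does not specify how the region style loss is computed; a minimal sketch of one common formulation (Gram-matrix style representations compared per body-region crop, here in NumPy, with a hypothetical `(top, left, height, width)` box format standing in for regions derived from body key points) might look like:

```python
import numpy as np

def gram_matrix(feat):
    """Style representation of a (C, H, W) feature map:
    the normalized matrix of channel-wise correlations."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def region_style_loss(gen_feats, ref_feats, regions):
    """Sum of squared Gram-matrix differences over body-region crops.

    gen_feats, ref_feats: (C, H, W) feature maps of the generated and
    reference images. regions: list of (top, left, height, width) boxes
    (a hypothetical format for illustration; the paper derives regions
    from body key points).
    """
    loss = 0.0
    for (t, l, h, w) in regions:
        g = gram_matrix(gen_feats[:, t:t + h, l:l + w])
        r = gram_matrix(ref_feats[:, t:t + h, l:l + w])
        loss += float(np.sum((g - r) ** 2))
    return loss
```

The loss is zero when the generated and reference features agree inside every region, and grows as their per-region texture statistics diverge; in practice it would be added, with a weighting coefficient, to the adversarial and reconstruction terms of the GAN objective.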

Original language: English
Journal: IEEE Multimedia
State: Accepted/In press - 2021

Keywords

  • Convolution
  • Decoding
  • Feature extraction
  • Generative adversarial networks
  • Strain
  • Urban areas
  • Videos
  • deep learning
  • image generation
  • pose transfer

