Publication Type

Journal Article

Version

acceptedVersion

Publication Date

1-2025

Abstract

Human pose estimation (HPE) models underperform in recognizing rare poses because they suffer from data imbalance problems (i.e., there are few image samples for rare poses) in their training datasets. From a data perspective, the most intuitive solution is to synthesize data for rare poses. Specifically, the rule-based methods apply manual manipulations (such as Cutout and GridMask) to the existing data, so the limited diversity of the data constrains the model. An alternative method is to learn the underlying data distribution via deep generative models (such as ControlNet and HumanSD) and then sample “new data” from the distribution. This works well for generating frequent poses in common scenes, but suffers when applied to rare poses or complex scenes (such as multiple persons with overlapping limbs). In this paper, we aim to address the above two issues, i.e., rare poses and complex scenes, for person image generation. We propose a two-stage method. In the first stage, we design a controllable pose generator named PoseFactory to synthesize rare poses. This generator is specifically trained on augmented pose data, and each pose is labelled with its level of difficulty and rarity. In the second stage, we introduce a multi-person image generator named MultipGenerator. It is conditioned on multiple human poses and textual descriptions of complex scenes. Both stages are controllable in terms of the diversity of poses and the complexity of scenes. For evaluation, we conduct extensive experiments on three widely used datasets: MS-COCO, HumanArt, and OCHuman. We compare our method against traditional pose data augmentation and person image generation methods, and it demonstrates its superior performance both quantitatively and qualitatively.

Discipline

Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces

Areas of Excellence

Digital transformation

Publication

IEEE Transactions on Multimedia

First Page

1

Last Page

13

ISSN

1520-9210

Publisher

Institute of Electrical and Electronics Engineers

Comments

Accepted in Jan 2025, now we are waiting for TMM's proofreading and publication notification.

Share

COinS