top-down
Papers with tag top-down
2022
- Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation. Zhehan Kan, Shuoshuo Chen, Zeng Li, and Zhihai He. In ECCV 2022.
We observe that human poses exhibit strong group-wise structural correlation and spatial coupling between keypoints due to the biological constraints of different body parts. This group-wise structural correlation can be explored to improve the accuracy and robustness of human pose estimation. In this work, we develop a self-constrained prediction-verification network to characterize and learn the structural correlation between keypoints during training. During the inference stage, the feedback information from the verification network allows us to perform further optimization of pose prediction, which significantly improves the performance of human pose estimation. Specifically, we partition the keypoints into groups according to the biological structure of human body. Within each group, the keypoints are further partitioned into two subsets, high-confidence base keypoints and low-confidence terminal keypoints. We develop a self-constrained prediction-verification network to perform forward and backward predictions between these keypoint subsets. One fundamental challenge in pose estimation, as well as in generic prediction tasks, is that there is no mechanism for us to verify if the obtained pose estimation or prediction results are accurate or not, since the ground truth is not available. Once successfully learned, the verification network serves as an accuracy verification module for the forward pose prediction. During the inference stage, it can be used to guide the local optimization of the pose estimation results of low-confidence keypoints with the self-constrained loss on high-confidence keypoints as the objective function. Our extensive experimental results on benchmark MS COCO and CrowdPose datasets demonstrate that the proposed method can significantly improve the pose estimation results.
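The inference-time idea in this abstract can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' implementation: `split_group`, `toy_backward`, and `refine_terminal` are hypothetical names, the learned verification network is replaced by a hand-written midpoint rule, and the local optimization is a plain grid search. It shows only the shape of the procedure: split a group by confidence, then adjust the low-confidence keypoint so that a backward prediction agrees with a high-confidence keypoint.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Backward = Callable[[Point, Point], Point]


def split_group(confidences: Dict[str, float],
                num_terminal: int = 1) -> Tuple[List[str], List[str]]:
    """Split one keypoint group into (base, terminal) names by confidence."""
    ranked = sorted(confidences, key=lambda name: confidences[name], reverse=True)
    cut = len(ranked) - num_terminal
    return ranked[:cut], ranked[cut:]


def toy_backward(terminal: Point, anchor: Point) -> Point:
    """Hypothetical stand-in for a learned verification network.

    It guesses the middle joint as the midpoint of the anchor and the terminal
    keypoint. A real verifier would be trained, not hand-written.
    """
    return ((terminal[0] + anchor[0]) / 2.0, (terminal[1] + anchor[1]) / 2.0)


def self_constrained_loss(terminal: Point, anchor: Point,
                          observed_base: Point, backward: Backward) -> float:
    """Squared distance between the backward prediction and the observed base keypoint."""
    px, py = backward(terminal, anchor)
    return (px - observed_base[0]) ** 2 + (py - observed_base[1]) ** 2


def refine_terminal(terminal: Point, anchor: Point, observed_base: Point,
                    backward: Backward, radius: int = 3, step: float = 1.0) -> Point:
    """Grid-search a small window around the initial terminal estimate."""
    best = terminal
    best_loss = self_constrained_loss(terminal, anchor, observed_base, backward)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            candidate = (terminal[0] + dx * step, terminal[1] + dy * step)
            loss = self_constrained_loss(candidate, anchor, observed_base, backward)
            if loss < best_loss:
                best, best_loss = candidate, loss
    return best


# Toy arm: shoulder and elbow are confident, the wrist is not.
base, terminal = split_group({"shoulder": 0.95, "elbow": 0.90, "wrist": 0.40})
refined_wrist = refine_terminal(
    terminal=(18.0, 2.0),      # initial low-confidence wrist estimate
    anchor=(0.0, 0.0),         # shoulder
    observed_base=(10.0, 0.0), # elbow
    backward=toy_backward,
)
print(base, terminal, refined_wrist)
```

In this toy setup the wrist moves from (18, 2) to (20, 0), the point at which the midpoint rule reproduces the observed elbow exactly. The ground truth is never consulted; only agreement with the high-confidence keypoint drives the update.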
- Poseur: Direct Human Pose Regression with Transformers. Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, and Anton Hengel. In ECCV 2022.
We propose a direct, regression-based approach to 2D human pose estimation from single images. We formulate the problem as a sequence prediction task, which we solve using a Transformer network. This network directly learns a regression mapping from images to the keypoint coordinates, without resorting to intermediate representations such as heatmaps. This approach avoids much of the complexity associated with heatmap-based approaches. To overcome the feature misalignment issues of previous regression-based methods, we propose an attention mechanism that adaptively attends to the features that are most relevant to the target keypoints, considerably improving the accuracy. Importantly, our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints. Experiments on MS-COCO and MPII, two predominant pose-estimation datasets, demonstrate that our method significantly improves upon the state-of-the-art in regression-based pose estimation. More notably, ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods.
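As a rough illustration of a keypoint query attending to image features and then regressing coordinates directly, the sketch below uses generic scaled dot-product attention followed by a linear head. It is not the attention design of the paper: the function names, the toy feature vectors, and the identity regression head are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Sequence[float], keys: Sequence[Sequence[float]],
           values: Sequence[Sequence[float]]) -> List[float]:
    """Single-query scaled dot-product attention: weighted sum of the values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def regress_xy(feature: Sequence[float], weight: Sequence[Sequence[float]],
               bias: Sequence[float]) -> Tuple[float, ...]:
    """Linear head mapping an attended feature to continuous (x, y) coordinates."""
    return tuple(sum(w * f for w, f in zip(row, feature)) + b
                 for row, b in zip(weight, bias))


# Toy data (assumed): three image locations, each with a key and a value vector.
keys = [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]]
values = [[4.0, 4.0], [12.0, 20.0], [28.0, 8.0]]
wrist_query = [0.0, 10.0]  # hypothetical learned query for one keypoint

attended = attend(wrist_query, keys, values)
# Identity head, so the output stays close to the value of the best-matching location.
x, y = regress_xy(attended, weight=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
print(x, y)
```

Because the output is a continuous coordinate pair rather than a peak in a discretized heatmap, no separate decoding step is needed, and every operation above is differentiable.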