Scholarly Article

Robust 3D Human Pose Estimation from Single Images or Video Sequences
Document Type
Periodical
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(5):1227-1241, May 2019
Subject
Computing and Processing
Bioengineering
Three-dimensional displays
Two dimensional displays
Cameras
Pose estimation
Video sequences
Robustness
Measurement errors
3D human pose estimation
sparse basis
anthropometric constraints
$L_1$-norm penalty function
Language
English
ISSN
0162-8828
2160-9292
1939-3539
Abstract
We propose a method for estimating 3D human poses from single images or video sequences. The task is challenging because (a) many 3D poses can have similar 2D projections, which makes the lifting ambiguous, and (b) current 2D joint detectors are not accurate, which can introduce large errors in the 3D estimates. We represent 3D poses by a sparse combination of bases that encode structural pose priors to reduce the lifting ambiguity, and we strengthen this prior by adding limb-length constraints. We estimate the 3D pose by minimizing an $L_1$-norm measurement error between the 2D pose and the projected 3D pose, since it is less sensitive to inaccurate 2D joint detections. We modify our algorithm to output $K$ 3D pose candidates per image, and, for videos, we impose a temporal smoothness constraint to select the best sequence of 3D poses from these candidates. We demonstrate good results on 3D pose estimation from static images and improved performance by selecting the best 3D pose from the $K$ proposals. Our results on video sequences also show improvements of roughly 15% over static images.
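As an illustrative sketch of the formulation the abstract describes (the exact objective, symbols, and constraint handling below are assumptions for exposition, not the paper's published equations), the 3D pose can be written as a sparse combination of basis poses $B_i$ and recovered by minimizing an $L_1$-norm reprojection error with an $L_1$ sparsity penalty, subject to limb-length constraints:

$$
\min_{\mathbf{c},\,R,\,\mathbf{t}} \; \bigl\| \mathbf{x} - \Pi\bigl(R\,S(\mathbf{c}) + \mathbf{t}\bigr) \bigr\|_{1} \;+\; \lambda \,\|\mathbf{c}\|_{1},
\qquad S(\mathbf{c}) = \sum_{i=1}^{n} c_i B_i,
\qquad \text{s.t. } \bigl\| S_{j}(\mathbf{c}) - S_{k}(\mathbf{c}) \bigr\|_{2} = \ell_{jk} \;\; \forall (j,k) \in \mathcal{E},
$$

where $\mathbf{x}$ denotes the detected 2D joints, $\Pi$ a camera projection, $(R, \mathbf{t})$ the camera rotation and translation, $\mathbf{c}$ the sparse basis coefficients, and $\ell_{jk}$ the prescribed limb length between joints $j$ and $k$ in the skeleton edge set $\mathcal{E}$. Under this reading, keeping the $K$ lowest-cost solutions yields the pose candidates mentioned in the abstract, and for video a temporal smoothness term over consecutive frames selects among them.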