To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body model (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de.
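The fitting pipeline described above (detect 2D features, then optimize model parameters to minimize a reprojection error under a pose prior) can be loosely illustrated with a toy example. The sketch below is not the actual SMPL-X/SMPLify-X implementation: the linear "body model", the quadratic prior, and all parameter names are hypothetical stand-ins, chosen only to show the optimize-to-fit-features structure.

```python
import numpy as np

# Hypothetical toy stand-in for SMPLify-style fitting: a fixed linear "model"
# maps 3 pose parameters to 5 two-dimensional joint positions. We minimize
# reprojection error to "detected" 2D keypoints plus a quadratic pose prior
# (the real method uses SMPL-X and a learned neural-network prior instead).
rng = np.random.default_rng(0)
J = rng.standard_normal((10, 3))          # toy linear model: 5 joints x 2 coords
theta_true = np.array([0.5, -1.0, 0.25])  # ground-truth pose parameters
keypoints = (J @ theta_true).reshape(5, 2)  # simulated 2D feature detections

def fit(keypoints, lam=1e-3, lr=0.02, steps=500):
    """Gradient descent on 0.5*||J@theta - y||^2 + 0.5*lam*||theta||^2."""
    theta = np.zeros(3)
    y = keypoints.ravel()
    for _ in range(steps):
        residual = J @ theta - y              # 2D reprojection residual
        grad = J.T @ residual + lam * theta   # data term + prior gradient
        theta -= lr * grad
    return theta

theta_hat = fit(keypoints)
print(theta_hat)  # should closely recover theta_true
```

In the real system the objective is non-linear (articulated kinematics, camera projection), so the paper uses PyTorch's automatic differentiation rather than a hand-written gradient, and adds terms such as the interpenetration penalty.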