The body pose of a person wearing a camera is of great interest for applications in augmented reality, healthcare, and robotics, yet much of the person's body is out of view for a typical wearable camera. We propose a learning-based approach to estimate the camera wearer's 3D body pose from egocentric video sequences. Our key insight is to leverage interactions with another person---whose body pose we can directly observe---as a signal inherently linked to the body pose of the first-person subject. We show that since interactions between individuals often induce a well-ordered series of back-and-forth responses, it is possible to learn a temporal model of the interlinked poses even though one party is largely out of view. We demonstrate our idea on a variety of domains with dyadic interaction and show its substantial impact on egocentric body pose estimation, improving the state of the art. Video results are available at http://vision.cs.utexas.edu/projects/you2me/
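To make the core idea concrete, below is a minimal sketch (not the authors' released code) of a temporal model that predicts the camera wearer's 3D pose from per-frame egocentric features together with the directly observed pose of the interaction partner. The class name, feature dimensions, joint counts, and the single-layer LSTM are all illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the abstract's idea: a temporal model over
# egocentric features plus the visible second person's pose, regressing
# the (largely unseen) camera wearer's 3D body pose per frame.
import torch
import torch.nn as nn


class EgoPoseLSTM(nn.Module):
    def __init__(self, scene_dim=512, partner_kpts=25, num_joints=17, hidden=256):
        super().__init__()
        # Per-frame input: egocentric scene features concatenated with
        # the partner's 2D keypoints (x, y per keypoint). All sizes are
        # assumptions for illustration.
        in_dim = scene_dim + 2 * partner_kpts
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        # Regress 3D coordinates for each of the wearer's body joints.
        self.head = nn.Linear(hidden, 3 * num_joints)

    def forward(self, scene_feats, partner_pose2d):
        # scene_feats:    (B, T, scene_dim)       egocentric video features
        # partner_pose2d: (B, T, partner_kpts, 2) observed second-person pose
        B, T = scene_feats.shape[:2]
        x = torch.cat([scene_feats, partner_pose2d.flatten(2)], dim=-1)
        h, _ = self.lstm(x)                     # temporal model over the clip
        return self.head(h).view(B, T, -1, 3)   # wearer's 3D pose per frame


# Example: one 30-frame clip with random stand-in features.
model = EgoPoseLSTM()
scene = torch.randn(1, 30, 512)
partner = torch.randn(1, 30, 25, 2)
pred = model(scene, partner)
print(pred.shape)  # torch.Size([1, 30, 17, 3])
```

The design choice the sketch illustrates is the one the abstract argues for: because dyadic interactions induce ordered back-and-forth responses, the partner's observed pose sequence carries information about the out-of-view wearer, so it enters the recurrent model as an input feature at every time step.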