Data-driven approaches to modeling human skeletal motion have found various applications in interactive media and social robotics. Challenges remain in these fields in generating high-fidelity samples and in robustly reconstructing motion from imperfect input data, for example when marker detections are missed. In this paper, we propose a probabilistic generative model that synthesizes and reconstructs long-horizon motion sequences conditioned on past information and control signals, such as the path along which an individual is moving. Our method adapts the existing work MoGlow by introducing a new graph-based model. The model leverages the spatial-temporal graph convolutional network (ST-GCN) to capture the spatial structure and temporal correlation of skeletal motion data at multiple scales. We evaluate the models on a mixture of motion capture datasets of human locomotion using footstep and bone-length analysis. The results demonstrate the advantages of our model in reconstructing missing markers, while it achieves comparable results in generating realistic future poses. When the inputs are imperfect, our model generates motion more robustly.