Synthesizing long-term human motion skeleton sequences is essential for human-centric video generation, with potential applications such as augmented reality, 3D character animation, and pedestrian trajectory prediction. Long-term human motion synthesis is challenging because of several factors: long-term temporal dependencies among poses, cyclic repetition across poses, bi-directional and multi-scale dependencies among poses, variable speed of actions, and a large, partially overlapping space of temporal pose variations across multiple classes/types of human activities. This paper addresses these challenges to synthesize long-term (> 6000 ms) human motion trajectories across a large variety of human activity classes (> 50). We propose a two-stage activity generation method: the first stage learns the long-term global pose dependencies in activity sequences by learning to synthesize a sparse motion trajectory, while the second stage generates dense motion trajectories from the output of the first stage. We demonstrate the superiority of the proposed method over state-of-the-art (SOTA) methods using various quantitative evaluation metrics on publicly available datasets.