Nowadays, the interaction between humans and robots is constantly expanding, requiring more and more human motion recognition applications to operate in real time. However, most works on temporal action detection and recognition perform these tasks in an offline manner, i.e., temporally segmented videos are classified as a whole. In this paper, building on the recently proposed framework of Temporal Recurrent Networks, we explore how temporal context and human movement dynamics can be effectively employed for online action detection. Our approach uses various state-of-the-art architectures and appropriately combines the extracted features to improve action detection. We evaluate our method on THUMOS'14, a challenging but widely used dataset for temporal action localization. Our experiments show significant improvement over the baseline method, achieving state-of-the-art results on THUMOS'14.