[PaoPao One Minute] Skeleton-Based Action Recognition Using Temporal Sliding LSTM Networks (ICCV2017-106)


September 27, 2018 · PaoPao Robot SLAM (泡泡机器人SLAM)

One minute a day, taking you through papers from the top robotics conferences

Title: Ensemble Deep Learning for Skeleton-based Action Recognition using Temporal Sliding LSTM networks

Authors: Inwoong Lee, Doyoung Kim, Seoungyoon Kang, Sanghoon Lee

Source: ICCV 2017 (IEEE International Conference on Computer Vision)

Compiled by: 李建禹

Reviewed by: 陈世浪


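Summary

This paper addresses the feature representation of skeleton joints and the modeling of temporal dynamics for recognizing human actions. Traditional methods generally use relative coordinate systems that depend on particular joints, and they model only long-term dependencies, leaving out short-term and medium-term ones. Instead of taking raw skeletons as input, the authors transform the skeletons into another coordinate system to gain robustness to scale, rotation, and translation, and then extract salient motion features from the transformed skeletons. Because LSTM networks with different time-step sizes can model different attributes well, the paper proposes Temporal Sliding LSTM (TS-LSTM) networks for skeleton-based action recognition. The proposed network consists of multiple parts containing short-term, medium-term, and long-term TS-LSTM networks, respectively. An average ensemble over these parts serves as the final feature, capturing a range of temporal dependencies.

The paper evaluates the proposed networks alongside several alternative architectures to verify their effectiveness, and compares them with other methods on five challenging datasets. The experimental results show that the network models achieve state-of-the-art performance by exploiting various temporal features. The authors also analyze the relation between the recognized actions and the multi-term TS-LSTM features by visualizing the softmax features of the multiple parts.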
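The coordinate transformation and motion features mentioned in the summary can be illustrated with a small sketch. This is not the paper's actual transform, which this digest does not spell out. It assumes a hypothetical root joint (index 0) and reference joint (index 1), handles only translation and scale (rotation alignment is omitted), and takes motion features to be simple frame-to-frame joint displacements. The function names are made up for illustration.

```python
import math

def normalize_skeleton(frames, root=0, ref=1):
    """Center each frame on the root joint and divide by the root-to-ref distance.

    frames: list of frames; each frame is a list of (x, y, z) joint tuples.
    root, ref: hypothetical joint indices (assumptions, not from the paper).
    """
    out = []
    for joints in frames:
        rx, ry, rz = joints[root]
        # Translation robustness: express every joint relative to the root.
        centered = [(x - rx, y - ry, z - rz) for x, y, z in joints]
        # Scale robustness: use one bone length as the unit of distance.
        scale = math.sqrt(sum(c * c for c in centered[ref])) or 1.0
        out.append([(x / scale, y / scale, z / scale) for x, y, z in centered])
    return out

def motion_features(frames, offset=1):
    """Per-joint displacement between frame t and frame t + offset."""
    feats = []
    for t in range(len(frames) - offset):
        feats.append([
            tuple(b - a for a, b in zip(j0, j1))
            for j0, j1 in zip(frames[t], frames[t + offset])
        ])
    return feats
```

With this sketch, two frames that show the same pose at different positions and sizes normalize to identical coordinates, so their displacement is zero; only genuine pose changes produce nonzero motion features.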

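Figure 1: Overall framework of the system

Figure 2: Conceptual diagram of the proposed TS-LSTM module

Figure 3: Overall architecture composed of short-term, medium-term, long-term, and pose TS-LSTM modules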

Abstract

This paper addresses the problems of feature representation of skeleton joints and the modeling of temporal dynamics to recognize human actions. Traditional methods generally use relative coordinate systems dependent on some joints, and model only the long-term dependency, while excluding short-term and medium-term dependencies. Instead of taking raw skeletons as the input, we transform the skeletons into another coordinate system to obtain the robustness to scale, rotation and translation, and then extract salient motion features from them. Considering that Long Short-Term Memory (LSTM) networks with various time-step sizes can model various attributes well, we propose novel ensemble Temporal Sliding LSTM (TS-LSTM) networks for skeleton-based action recognition. The proposed network is composed of multiple parts containing short-term, medium-term and long-term TS-LSTM networks, respectively. In our network, we utilize an average ensemble among multiple parts as a final feature to capture various temporal dependencies. We evaluate the proposed networks and the additional other architectures to verify the effectiveness of the proposed networks, and also compare them with several other methods on five challenging datasets. The experimental results demonstrate that our network models achieve the state-of-the-art performance through various temporal features. Additionally, we analyze a relation between the recognized actions and the multi-term TS-LSTM features by visualizing the softmax features of multiple parts.
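To make the "temporal sliding" and "average ensemble" ideas concrete, here is a minimal sketch in plain Python. It does not implement the LSTM branches themselves: it only shows how a sequence might be cut into windows of different lengths, and how the softmax outputs of several branches can be averaged class by class. The window sizes, strides, and logits are hypothetical values chosen for illustration, not settings from the paper.

```python
import math

def sliding_windows(seq, size, stride):
    """Split a sequence into windows of `size` items, advancing by `stride`."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, stride)]

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_ensemble(branch_logits):
    """Average the softmax outputs of several branches, class by class."""
    probs = [softmax(logits) for logits in branch_logits]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

# Hypothetical usage: short-, medium-, and long-term views of a 6-frame sequence.
frame_ids = list(range(6))
short_term = sliding_windows(frame_ids, size=2, stride=2)
medium_term = sliding_windows(frame_ids, size=4, stride=2)
long_term = sliding_windows(frame_ids, size=6, stride=1)

# Hypothetical per-branch class scores standing in for TS-LSTM outputs.
scores = average_ensemble([[2.0, 0.5, 0.1], [1.5, 1.0, 0.2], [0.3, 0.2, 0.1]])
predicted_class = scores.index(max(scores))
```

Averaging probabilities rather than raw scores keeps every branch on the same scale, so a branch with large logits cannot dominate the ensemble simply because of its magnitude.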

