[Paopao One Minute] OFF: A Fast and Robust Motion Representation for Video Action Recognition

March 12, 2019 · 泡泡机器人SLAM (Paopao Robot SLAM)

One minute a day, taking you through papers from top robotics conferences

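Title: Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

Authors: Shuyang Sun, Zhanghui Kuang, Lu Sheng, et al.

Source: 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Compiled by: 陈世浪

Reviewed by: 颜青松

Individuals are welcome to share this post on WeChat Moments; other organizations or media accounts that wish to repost it should leave a message in the account backend to request authorization.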

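Summary

Motion representation plays a vital role in recognizing human actions in video. This work introduces a new compact motion representation, the Optical Flow guided Feature (OFF), which lets a network extract temporal information quickly and robustly.

OFF is derived from the definition of optical flow and is orthogonal to the optical flow. Because it is computed directly as pixel-wise spatio-temporal gradients of deep feature maps, OFF can be embedded in any existing CNN-based video action recognition framework at only a slight additional cost. It lets the CNN extract spatial and temporal information together, in particular the temporal information between frames.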
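To make the orthogonality claim concrete: if a feature value f is assumed to stay constant as a point moves with velocity (vx, vy) between frames, a first-order expansion gives ∂f/∂x·vx + ∂f/∂y·vy + ∂f/∂t = 0, so the gradient vector (∂f/∂x, ∂f/∂y, ∂f/∂t) is orthogonal to (vx, vy, 1). The sketch below is a minimal NumPy illustration of that idea and is not the authors' implementation. It assumes 3x3 Sobel kernels (normalized by 8) for the spatial gradients, a plain difference of two feature maps for the temporal gradient, and edge padding at the borders. The names `sobel_gradients` and `off_sketch` and the toy linear feature map are hypothetical, chosen only for illustration.

```python
import numpy as np


def sobel_gradients(feat):
    """Spatial gradients of a (C, H, W) feature map with 3x3 Sobel kernels.

    Assumption: edge padding at the borders and division by 8, so that the
    result estimates the derivative per pixel on a unit grid.
    """
    p = np.pad(feat, ((0, 0), (1, 1), (1, 1)), mode="edge")
    # d/dx: right column minus left column, weighted 1-2-1 across rows.
    gx = (p[:, :-2, 2:] + 2 * p[:, 1:-1, 2:] + p[:, 2:, 2:]
          - p[:, :-2, :-2] - 2 * p[:, 1:-1, :-2] - p[:, 2:, :-2]) / 8.0
    # d/dy: bottom row minus top row, weighted 1-2-1 across columns.
    gy = (p[:, 2:, :-2] + 2 * p[:, 2:, 1:-1] + p[:, 2:, 2:]
          - p[:, :-2, :-2] - 2 * p[:, :-2, 1:-1] - p[:, :-2, 2:]) / 8.0
    return gx, gy


def off_sketch(feat_t, feat_t1):
    """Stack spatial and temporal gradients of two consecutive feature maps.

    Assumption: spatial gradients are taken on the earlier frame only.
    Returns an array of shape (3 * C, H, W).
    """
    gx, gy = sobel_gradients(feat_t)
    ft = feat_t1 - feat_t  # temporal gradient as a frame difference
    return np.concatenate([gx, gy, ft], axis=0)


# Toy check: a linear "feature map" translated by a known velocity (vx, vy).
ys, xs = np.mgrid[0:6, 0:8].astype(float)
a, b = 0.5, -0.25      # spatial slopes of the toy feature map
vx, vy = 1.0, 1.0      # hypothetical motion in pixels per frame
feat_t = (a * xs + b * ys)[None]                  # shape (1, 6, 8)
feat_t1 = (a * (xs - vx) + b * (ys - vy))[None]   # same pattern, shifted

off = off_sketch(feat_t, feat_t1)
gx, gy, ft = off  # one channel each because C == 1

# Away from the padded border, the constraint should hold.
residual = gx * vx + gy * vy + ft
print(off.shape)
print(np.abs(residual[1:-1, 1:-1]).max())
```

On this toy input the residual vanishes in the interior because the feature map is exactly linear; on real CNN features the constraint only holds approximately, and the border pixels are affected by the padding choice.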

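Experiments validate this simple but powerful idea. With RGB input alone, the network with OFF reaches 93.3% accuracy on UCF-101, comparable to a two-stream model that uses both RGB and optical flow, while running 15 times faster. The experiments also show that OFF is complementary to other motion modalities such as optical flow. Plugged into a state-of-the-art video action recognition framework, the method achieves 96.0% and 74.2% accuracy on UCF-101 and HMDB-51, respectively.

The project's source code is available at: https://github.com/kevin-ssy/Optical-Flow-Guided-Feature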

Abstract

Motion representation plays a vital role in human action recognition in videos. In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach. The OFF is derived from the definition of optical flow and is orthogonal to the optical flow. The derivation also provides theoretical support for using the difference between two frames. By directly calculating pixel-wise spatio-temporal gradients of the deep feature maps, the OFF could be embedded in any existing CNN based video action recognition framework with only a slight additional cost. It enables the CNN to extract spatio-temporal information, especially the temporal information between frames simultaneously. This simple but powerful idea is validated by experimental results. The network with OFF fed only by RGB inputs achieves a competitive accuracy of 93.3% on UCF-101, which is comparable with the result obtained by two streams (RGB and optical flow), but is 15 times faster in speed. Experimental results also show that OFF is complementary to other motion modalities such as optical flow. When the proposed method is plugged into the state-of-the-art video action recognition framework, it has 96.0% and 74.2% accuracy on UCF-101 and HMDB-51 respectively. The code for this project is available at: https://github.com/kevin-ssy/Optical-Flow-Guided-Feature

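If you are interested in this paper and would like to download the full text, follow the 泡泡机器人SLAM WeChat official account (paopaorobot_slam).

Welcome to the Paopao forum, where experts answer any questions you have about SLAM.

Whether you have a question to ask or want to browse threads and answer questions, the Paopao forum welcomes you!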

Paopao website: www.paopaorobot.org

Paopao forum: http://paopaorobot.org/forums/