【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition - 专知VIP

会员服务 ·

4

CVPR 2022 · 时空解耦与重耦 · RGB-D模态 · 跨模态时空信息交互 · 自适应后验融合 ·

2022 年 3 月 19 日

【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

专知会员服务

专知，提供专业可信的知识分发服务，让认知协作更快更好！

在行为识别领域，虽然当前的一些基于RGB-D模态的动作识别方法可以取得显著效果，但是他们都是建立在时空紧密耦合架构的基础上进行的时空信息建模。因此，这些方法主要存在以下三个问题：（1）由于时空建模过程的紧密耦合，导致在一些小数据集上面临一定的优化困难；（2）网络中包含的大量与分类无关的边缘冗余信息可能会误导分类器做出错误的决策；（3）视频多模态信息之间缺乏有效的时空交互导致后验融合机制不能充分发挥其作用。所以在本文中，我们提出了一种有效建模时空信息的解耦与重耦合机制以及一种新颖的RGB-D多模态时空信息交互策略。具体来讲，我们将多模态时空信息建模过程分成三个子任务：（1）通过解耦时空建模网络实现高质量维度无关的时间和空间表征学习。（2）重新耦合这些解耦的时空表征以重新建立强时空依赖。（3）引入一种新的跨模态时空信息交互方案和自适应后验融合机制（CAPF）来深度融合RGB-D多模态时空信息。通过充分利用以上技术，可以实现更加鲁棒的时空表征学习。

基于解耦与重耦合机制的多模态时空表征学习网络架构

作者：Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang, Du Zhang, Zhen Lei, Hao Li, Rong Jin

成为VIP会员查看完整内容

14

相关内容

CVPR 2022

CVPR 2022 将于2022年 6 月 21-24 日在美国的新奥尔良举行。CVPR是IEEE Conference on Computer Vision and Pattern Recognition的缩写，即IEEE国际计算机视觉与模式识别会议。该会议是由IEEE举办的计算机视觉和模式识别领域的顶级会议，会议的主要内容是计算机视觉与模式识别技术。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

【TPAMI2022】关联关系驱动的多模态分类，AF: An Association-based Fusion Method for Multi-Modal Classification

【TPAMI2022】关联关系驱动的多模态分类，AF: An Association-based Fusion Method for Multi-Modal Classification

专知会员服务

27+阅读 · 2022年3月22日

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

专知会员服务

17+阅读 · 2022年3月19日

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

专知会员服务

13+阅读 · 2022年3月19日

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

专知会员服务

7+阅读 · 2022年3月19日

【CVPR 2022】基于视觉-语言验证和迭代推理的视觉定位,Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

【CVPR 2022】基于视觉-语言验证和迭代推理的视觉定位,Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

专知会员服务

12+阅读 · 2022年3月19日

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

【CVPR 2022】通过动态梯度调制平衡视听学习，Balanced Audio-visual Learning via On-the-fly Gradient Modulation

【CVPR 2022】通过动态梯度调制平衡视听学习，Balanced Audio-visual Learning via On-the-fly Gradient Modulation

专知会员服务

9+阅读 · 2022年3月12日

【CVPR 2022】盲图像超分辨率退化分布的研究，Learning the Degradation Distribution for Blind Image Super-Resolution

【CVPR 2022】盲图像超分辨率退化分布的研究，Learning the Degradation Distribution for Blind Image Super-Resolution

专知会员服务

7+阅读 · 2022年3月12日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

专知会员服务

78+阅读 · 2020年2月25日

CVPR 2021 | 中科院自动化所、字节跳动提出高性能的指代性分割基准模型

CVPR 2021 | 中科院自动化所、字节跳动提出高性能的指代性分割基准模型

机器之心

2+阅读 · 2021年5月1日

【泡泡图灵智库】ContextDesc：用跨模态上下文增强的局部描述子

【泡泡图灵智库】ContextDesc：用跨模态上下文增强的局部描述子

泡泡机器人SLAM

34+阅读 · 2019年9月18日

【泡泡图灵智库】基于RGB-D相机多视图深度学习的一致语义建图

【泡泡图灵智库】基于RGB-D相机多视图深度学习的一致语义建图

泡泡机器人SLAM

12+阅读 · 2019年9月3日

TPAMI 2019 | 鲁棒RGB-D人脸识别

TPAMI 2019 | 鲁棒RGB-D人脸识别

计算机视觉life

11+阅读 · 2019年6月8日

【泡泡图灵智库】NM-Net：基于邻接点一致性的鲁邦特征点匹配（CVPR）

【泡泡图灵智库】NM-Net：基于邻接点一致性的鲁邦特征点匹配（CVPR）

泡泡机器人SLAM

35+阅读 · 2019年4月28日

【泡泡点云时空】基于选择性传感器融合的神经网络视觉里程计

【泡泡点云时空】基于选择性传感器融合的神经网络视觉里程计

泡泡机器人SLAM

18+阅读 · 2019年4月21日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

论文浅尝 | Global Relation Embedding for Relation Extraction

论文浅尝 | Global Relation Embedding for Relation Extraction

开放知识图谱

12+阅读 · 2019年3月3日

行为识别（action recognition）目前的难点在哪？

行为识别（action recognition）目前的难点在哪？

极市平台

36+阅读 · 2019年2月14日

【泡泡图灵智库】PointNet：用于三维分类与分割的点集深度学习（CVPR）

【泡泡图灵智库】PointNet：用于三维分类与分割的点集深度学习（CVPR）

泡泡机器人SLAM

11+阅读 · 2019年1月20日

基于时空模式的复杂行为识别方法研究

国家自然科学基金

2+阅读 · 2017年12月31日

基于深层特征学习的RGB-D人体行为识别方法

国家自然科学基金

4+阅读 · 2015年12月31日

基于多任务稀疏学习的视频行为理解

国家自然科学基金

0+阅读 · 2014年12月31日

基于知识迁移的跨领域人体动作识别

国家自然科学基金

5+阅读 · 2013年12月31日

基于Exemplar-Classifier思想的高分辨率光学遥感影像目标识别研究

国家自然科学基金

2+阅读 · 2013年12月31日

面向RGB-D视频的人体动作识别研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于深度和图像融合的特征判别模型及其在形变三维人脸跟踪中的研究

国家自然科学基金

0+阅读 · 2012年12月31日

车载激光扫描点云与全景影像的高精度配准方法

国家自然科学基金

0+阅读 · 2012年12月31日

基于时空流形学习与概率图模型的人体动作识别

国家自然科学基金

2+阅读 · 2012年12月31日

视频数据空间化方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

THORN: Temporal Human-Object Relation Network for Action Recognition

THORN: Temporal Human-Object Relation Network for Action Recognition

Arxiv

0+阅读 · 2022年4月20日

A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

Arxiv

0+阅读 · 2022年4月20日

Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing

Arxiv

1+阅读 · 2022年4月19日

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

Arxiv

0+阅读 · 2022年4月19日

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Arxiv

0+阅读 · 2022年4月19日

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

Arxiv

0+阅读 · 2022年4月18日

End-to-end Weakly-supervised Multiple 3D Hand Mesh Reconstruction from Single Image

Arxiv

0+阅读 · 2022年4月18日

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Arxiv

15+阅读 · 2021年4月12日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

Learning Conceptual-Contexual Embeddings for Medical Text

Arxiv

27+阅读 · 2019年8月16日

VIP会员

相关主题

时空解耦与重耦

跨模态时空信息交互

自适应后验融合

相关VIP内容

【TPAMI2022】关联关系驱动的多模态分类，AF: An Association-based Fusion Method for Multi-Modal Classification

【TPAMI2022】关联关系驱动的多模态分类，AF: An Association-based Fusion Method for Multi-Modal Classification

专知会员服务

27+阅读 · 2022年3月22日

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

专知会员服务

17+阅读 · 2022年3月19日

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

专知会员服务

13+阅读 · 2022年3月19日

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

【CVPR 2022】基于层次化视觉语言知识蒸馏的开放词汇单阶段检测，Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning

专知会员服务

7+阅读 · 2022年3月19日

【CVPR 2022】基于视觉-语言验证和迭代推理的视觉定位,Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

【CVPR 2022】基于视觉-语言验证和迭代推理的视觉定位,Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

专知会员服务

12+阅读 · 2022年3月19日

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

【CVPR 2022】通过动态梯度调制平衡视听学习，Balanced Audio-visual Learning via On-the-fly Gradient Modulation

【CVPR 2022】通过动态梯度调制平衡视听学习，Balanced Audio-visual Learning via On-the-fly Gradient Modulation

专知会员服务

9+阅读 · 2022年3月12日

【CVPR 2022】盲图像超分辨率退化分布的研究，Learning the Degradation Distribution for Blind Image Super-Resolution

【CVPR 2022】盲图像超分辨率退化分布的研究，Learning the Degradation Distribution for Blind Image Super-Resolution

专知会员服务

7+阅读 · 2022年3月12日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

【CVPR2020】用于细粒度动作识别的多模式域自适应，Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

专知会员服务

78+阅读 · 2020年2月25日

热门VIP内容

开通专知VIP会员享更多权益服务

【ACL2025教程】大语言模型的护栏与安全性：对其应用的安全、可靠与可控引导

《实现协同自主：从人机协作到多智能体系统》最新190页

【ICML2025】SToFM：一种用于空间转录组学的多尺度基础模型

通信网络智能体白皮书V1.0，61页pdf

相关资讯

CVPR 2021 | 中科院自动化所、字节跳动提出高性能的指代性分割基准模型

CVPR 2021 | 中科院自动化所、字节跳动提出高性能的指代性分割基准模型

机器之心

2+阅读 · 2021年5月1日

【泡泡图灵智库】ContextDesc：用跨模态上下文增强的局部描述子

【泡泡图灵智库】ContextDesc：用跨模态上下文增强的局部描述子

泡泡机器人SLAM

34+阅读 · 2019年9月18日

【泡泡图灵智库】基于RGB-D相机多视图深度学习的一致语义建图

【泡泡图灵智库】基于RGB-D相机多视图深度学习的一致语义建图

泡泡机器人SLAM

12+阅读 · 2019年9月3日

TPAMI 2019 | 鲁棒RGB-D人脸识别

TPAMI 2019 | 鲁棒RGB-D人脸识别

计算机视觉life

11+阅读 · 2019年6月8日

【泡泡图灵智库】NM-Net：基于邻接点一致性的鲁邦特征点匹配（CVPR）

【泡泡图灵智库】NM-Net：基于邻接点一致性的鲁邦特征点匹配（CVPR）

泡泡机器人SLAM

35+阅读 · 2019年4月28日

【泡泡点云时空】基于选择性传感器融合的神经网络视觉里程计

【泡泡点云时空】基于选择性传感器融合的神经网络视觉里程计

泡泡机器人SLAM

18+阅读 · 2019年4月21日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

论文浅尝 | Global Relation Embedding for Relation Extraction

论文浅尝 | Global Relation Embedding for Relation Extraction

开放知识图谱

12+阅读 · 2019年3月3日

行为识别（action recognition）目前的难点在哪？

行为识别（action recognition）目前的难点在哪？

极市平台

36+阅读 · 2019年2月14日

【泡泡图灵智库】PointNet：用于三维分类与分割的点集深度学习（CVPR）

【泡泡图灵智库】PointNet：用于三维分类与分割的点集深度学习（CVPR）

泡泡机器人SLAM

11+阅读 · 2019年1月20日

相关基金

基于时空模式的复杂行为识别方法研究

国家自然科学基金

2+阅读 · 2017年12月31日

基于深层特征学习的RGB-D人体行为识别方法

国家自然科学基金

4+阅读 · 2015年12月31日

基于多任务稀疏学习的视频行为理解

国家自然科学基金

0+阅读 · 2014年12月31日

基于知识迁移的跨领域人体动作识别

国家自然科学基金

5+阅读 · 2013年12月31日

基于Exemplar-Classifier思想的高分辨率光学遥感影像目标识别研究

国家自然科学基金

2+阅读 · 2013年12月31日

面向RGB-D视频的人体动作识别研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于深度和图像融合的特征判别模型及其在形变三维人脸跟踪中的研究

国家自然科学基金

0+阅读 · 2012年12月31日

车载激光扫描点云与全景影像的高精度配准方法

国家自然科学基金

0+阅读 · 2012年12月31日

基于时空流形学习与概率图模型的人体动作识别

国家自然科学基金

2+阅读 · 2012年12月31日

视频数据空间化方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

相关论文

THORN: Temporal Human-Object Relation Network for Action Recognition

THORN: Temporal Human-Object Relation Network for Action Recognition

Arxiv

0+阅读 · 2022年4月20日

A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

Arxiv

0+阅读 · 2022年4月20日

Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing

Arxiv

1+阅读 · 2022年4月19日

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

Arxiv

0+阅读 · 2022年4月19日

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Arxiv

0+阅读 · 2022年4月19日

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

Arxiv

0+阅读 · 2022年4月18日

End-to-end Weakly-supervised Multiple 3D Hand Mesh Reconstruction from Single Image

Arxiv

0+阅读 · 2022年4月18日

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Arxiv

15+阅读 · 2021年4月12日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

Learning Conceptual-Contexual Embeddings for Medical Text

Arxiv

27+阅读 · 2019年8月16日

微信扫码咨询专知VIP会员