Current methods for video activity localisation over time implicitly assume that the activity temporal boundaries labelled for model training are determinate and precise. However, in unscripted natural videos, different activities mostly transition smoothly, so it is intrinsically ambiguous to label precisely when an activity starts and ends. Such temporal labelling uncertainties are currently ignored in model training, resulting in learning mismatched video-text correlations that generalise poorly at test time. In this work, we solve this problem by introducing Elastic Moment Bounding (EMB) to accommodate flexible and adaptive activity temporal boundaries, towards modelling universally interpretable video-text correlations with tolerance to the underlying temporal uncertainties in pre-fixed annotations. Specifically, we construct elastic boundaries adaptively by mining and discovering frame-wise temporal endpoints that maximise the alignment between video segments and query sentences. To enable both more robust matching (segment content attention) and more accurate localisation (segment elastic boundaries), we optimise the selection of frame-wise endpoints subject to segment-wise contents by a novel Guided Attention mechanism. Extensive experiments on three video activity localisation benchmarks demonstrate compellingly EMB's advantages over existing methods that do not model label uncertainty.
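As a rough illustration only (not the paper's implementation), the sketch below shows one way frame-wise elastic boundary mining could look in PyTorch: within a small slack window around the annotated endpoints, it enumerates candidate (start, end) pairs and keeps the one whose segment best aligns with the query embedding. The function name `elastic_endpoints`, the `slack` parameter, and the brute-force window search are all assumptions for illustration; in EMB the endpoint selection is driven by the learned Guided Attention over segment contents rather than raw similarity scores.

```python
import torch

def elastic_endpoints(frame_feats, query_feat, t_start, t_end, slack=3):
    """Hypothetical sketch of elastic boundary mining (not the EMB code):
    within a slack window around the annotated endpoints, pick the
    frame-wise (start, end) pair maximising mean frame-query alignment."""
    scores = frame_feats @ query_feat                   # (T,) frame-query similarities
    T = frame_feats.size(0)
    best, best_pair = float("-inf"), (t_start, t_end)
    # search endpoints elastically around the (possibly imprecise) annotation
    for s in range(max(0, t_start - slack), min(t_start + slack, t_end) + 1):
        for e in range(max(s, t_end - slack), min(T - 1, t_end + slack) + 1):
            score = scores[s:e + 1].mean().item()       # segment-level alignment
            if score > best:
                best, best_pair = score, (s, e)
    return best_pair

# toy usage: 20 frames, 8-d features, annotated segment [5, 12]
frames = torch.nn.functional.normalize(torch.randn(20, 8), dim=-1)
query = torch.nn.functional.normalize(torch.randn(8), dim=-1)
print(elastic_endpoints(frames, query, t_start=5, t_end=12))
```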