Balanced Multimodal Learning via On-the-fly Gradient Modulation - Zhuanzhi Papers (专知论文)

March 29, 2022

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Xiaokang Peng, Yake Wei, Andong Deng, Dong Wang, Di Hu

From arXiv; accepted by CVPR 2022 (Oral)

Multimodal learning helps to comprehensively understand the world, by integrating different senses. Accordingly, multiple input modalities are expected to boost model performance, but we actually find that they are not fully exploited even when the multimodal model outperforms its uni-modal counterpart. Specifically, in this paper we point out that existing multimodal discriminative models, in which uniform objective is designed for all modalities, could remain under-optimized uni-modal representations, caused by another dominated modality in some scenarios, e.g., sound in blowing wind event, vision in drawing picture event, etc. To alleviate this optimization imbalance, we propose on-the-fly gradient modulation to adaptively control the optimization of each modality, via monitoring the discrepancy of their contribution towards the learning objective. Further, an extra Gaussian noise that changes dynamically is introduced to avoid possible generalization drop caused by gradient modulation. As a result, we achieve considerable improvement over common fusion methods on different multimodal tasks, and this simple strategy can also boost existing multimodal methods, which illustrates its efficacy and versatility. The source code is available at https://github.com/GeWu-Lab/OGM-GE_CVPR2022.
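
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation (their code is in the repository linked above). It assumes two modalities, a hypothetical per-batch contribution score for each (for example, the summed probability each modality assigns to the correct class), a tanh-shaped coefficient with a hypothetical strength parameter `alpha`, and noise whose standard deviation follows the spread of the current gradient. None of these specifics are stated in the abstract; the function names are also made up for illustration.

```python
import math
import random


def modulation_coefficients(score_a, score_b, alpha=0.5):
    """Return (k_a, k_b), the gradient scale factors for two modalities.

    score_a and score_b are hypothetical positive contribution scores.
    The tanh form and the alpha parameter are assumptions of this sketch.
    """
    ratio_a = score_a / score_b
    ratio_b = score_b / score_a
    # Only the dominant modality (ratio > 1) is slowed down;
    # the weaker modality keeps its full gradient (factor 1.0).
    k_a = 1.0 - math.tanh(alpha * (ratio_a - 1.0)) if ratio_a > 1.0 else 1.0
    k_b = 1.0 - math.tanh(alpha * (ratio_b - 1.0)) if ratio_b > 1.0 else 1.0
    return k_a, k_b


def modulate(grad, k, rng):
    """Scale a flat gradient (list of floats) by k and add zero-mean
    Gaussian noise whose standard deviation is the spread of the
    current gradient, so the noise changes as training proceeds."""
    mean = sum(grad) / len(grad)
    std = math.sqrt(sum((g - mean) ** 2 for g in grad) / len(grad))
    return [k * g + rng.gauss(0.0, std) for g in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed scenario: modality A (say, audio) currently dominates.
    k_a, k_b = modulation_coefficients(score_a=2.0, score_b=1.0)
    grad_a = [0.4, -0.2, 0.1]
    print("k_a:", k_a, "k_b:", k_b)
    print("modulated grad_a:", modulate(grad_a, k_a, rng))
```

In a real training loop the scale factor and noise would be applied to each modality encoder's parameter gradients between the backward pass and the optimizer step; the flat Python lists here only stand in for those gradient tensors.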
