Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? - Zhuanzhi (专知) Papers

Singular value decomposition · Performer · Square matrix · Approximation · Pooling

May 6, 2021

Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling?

Yue Song, Nicu Sebe, Wei Wang

Global covariance pooling (GCP) aims at exploiting the second-order statistics of the convolutional feature. Its effectiveness has been demonstrated in boosting the classification performance of Convolutional Neural Networks (CNNs). Singular Value Decomposition (SVD) is used in GCP to compute the matrix square root. However, the approximate matrix square root calculated using Newton-Schulz iteration \cite{li2018towards} outperforms the accurate one computed via SVD \cite{li2017second}. We empirically analyze the reason behind the performance gap from the perspectives of data precision and gradient smoothness. Various remedies for computing smooth SVD gradients are investigated. Based on our observation and analyses, a hybrid training protocol is proposed for SVD-based GCP meta-layers such that competitive performances can be achieved against Newton-Schulz iteration. Moreover, we propose a new GCP meta-layer that uses SVD in the forward pass, and Padé approximants in the backward propagation to compute the gradients. The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
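To make the two square-root routes named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a feature array of shape (N, C), uses a symmetric eigendecomposition as the accurate route (for a symmetric positive semi-definite covariance this coincides with the SVD), and uses the standard coupled Newton-Schulz iteration with trace pre-normalization as the approximate route. The function names, the default iteration count, and the random test data are all illustrative assumptions.

```python
import numpy as np


def covariance_pooling(features):
    """Sample covariance of an (N, C) feature array: N positions, C channels."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / features.shape[0]


def sqrt_via_eigh(cov):
    """Accurate square root of a symmetric PSD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    # Scale each eigenvector column by sqrt(eigenvalue), then rotate back.
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def sqrt_via_newton_schulz(cov, num_iters=5):
    """Approximate square root via the coupled Newton-Schulz iteration.

    The input is divided by its trace first so that the iteration converges
    (assumes a positive definite input with trace > 0); the scale is restored
    at the end.
    """
    identity = np.eye(cov.shape[0])
    norm = np.trace(cov)
    y = cov / norm          # converges towards sqrt(cov / norm)
    z = identity.copy()     # converges towards the inverse of that square root
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y, z = y @ t, t @ z
    return y * np.sqrt(norm)


# Hypothetical data: 64 spatial positions, 4 channels.
rng = np.random.default_rng(0)
cov = covariance_pooling(rng.standard_normal((64, 4)))
exact = sqrt_via_eigh(cov)
for iters in (3, 5, 10):
    approx = sqrt_via_newton_schulz(cov, num_iters=iters)
    print(iters, np.abs(approx - exact).max())
```

The Newton-Schulz route uses only matrix multiplications, so its backward pass involves no explicit eigenvalue terms. That is one reason its gradients are usually considered better behaved than those of a decomposition-based square root, in line with the gradient-smoothness perspective mentioned in the abstract.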

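Related content

Singular value decomposition

Singular value decomposition (SVD) is an important matrix factorization in linear algebra that generalizes eigendecomposition to arbitrary matrices. It has important applications in fields such as signal processing and statistics.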
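As a small illustration of how SVD generalizes eigendecomposition, the sketch below factors a rectangular matrix, which has no eigendecomposition of its own, and checks that its squared singular values match the eigenvalues of the Gram matrix AᵀA. The matrix shape and the random data are assumptions made for the example.

```python
import numpy as np

# Hypothetical 5x3 matrix; any real rectangular matrix would do.
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 3))

# Thin SVD: a = u @ diag(s) @ vt, with u of shape (5, 3) and vt of shape (3, 3).
u, s, vt = np.linalg.svd(a, full_matrices=False)
reconstructed = (u * s) @ vt

# Eigenvalues of the symmetric Gram matrix, reordered to descending
# so they line up with the singular values.
gram_eigvals = np.linalg.eigvalsh(a.T @ a)[::-1]

print(np.allclose(reconstructed, a))
print(np.allclose(s ** 2, gram_eigvals))
```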
