Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering - 专知论文 (Zhuanzhi Papers)

Cluster · Kernel matrix · PCA · Kernelization · Storage

May 17, 2021

Romain Couillet, Florent Chatelain, Nicolas Le Bihan

From arXiv: 24 pages (10 for the core paper, 14 for the proofs in the supplementary material), 10 figures. Final version to be published in the ICML 2021 proceedings.

The article introduces an elementary cost and storage reduction method for spectral clustering and principal component analysis. The method consists of randomly "puncturing" both the data matrix $X\in\mathbb{C}^{p\times n}$ (or $\mathbb{R}^{p\times n}$) and its corresponding kernel (Gram) matrix $K$ through Bernoulli masks: $S\in\{0,1\}^{p\times n}$ for $X$ and $B\in\{0,1\}^{n\times n}$ for $K$. The resulting "two-way punctured" kernel is thus given by $K=\frac{1}{p}[(X \odot S)^{\sf H} (X \odot S)] \odot B$. We demonstrate that, for $X$ composed of independent columns drawn from a Gaussian mixture model, as $n,p\to\infty$ with $p/n\to c_0\in(0,\infty)$, the spectral behavior of $K$ (its limiting eigenvalue distribution, as well as its isolated eigenvalues and eigenvectors) is fully tractable and exhibits a series of counter-intuitive phenomena. We notably prove, and empirically confirm on GAN-generated image databases, that it is possible to drastically puncture the data, thereby providing possibly huge computational and storage gains, for a virtually constant (clustering or PCA) performance. This preliminary study thus opens the path towards rethinking, from a large-dimensional standpoint, computational and storage costs in elementary machine learning models.
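The kernel formula in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the parameters `eps_S` and `eps_B` (the Bernoulli keep-probabilities), the choice to symmetrise $B$ and keep its diagonal, the synthetic two-class Gaussian mixture, and the sign-based clustering step are all assumptions made for illustration.

```python
import numpy as np


def two_way_punctured_kernel(X, eps_S, eps_B, rng):
    """Return K = (1/p) [(X * S)^H (X * S)] * B for Bernoulli masks S and B."""
    p, n = X.shape
    # S: independent Bernoulli(eps_S) mask applied entry-wise to the data.
    S = rng.random((p, n)) < eps_S
    XS = X * S
    # B: Bernoulli(eps_B) mask on the kernel entries.
    # Assumption: B is symmetrised so that K stays Hermitian.
    upper = np.triu(rng.random((n, n)) < eps_B, k=1)
    B = upper | upper.T
    # Assumption: diagonal kernel entries are always kept.
    np.fill_diagonal(B, True)
    return (XS.conj().T @ XS) / p * B


def sign_clustering_accuracy(K, labels):
    """Cluster two classes by the sign of the dominant eigenvector of K."""
    # eigh returns eigenvalues in ascending order; the last column is dominant.
    _, vecs = np.linalg.eigh(K)
    guess = np.where(vecs[:, -1] >= 0, 1, -1)
    acc = np.mean(guess == labels)
    # The sign of an eigenvector is arbitrary, so accept either labelling.
    return max(acc, 1.0 - acc)


# Hypothetical synthetic data: two Gaussian classes with means -mu and +mu.
rng = np.random.default_rng(0)
p, n = 200, 400
labels = np.repeat([-1, 1], n // 2)
mu = np.full(p, 6.0 / np.sqrt(p))  # ||mu|| = 6
X = np.outer(mu, labels) + rng.standard_normal((p, n))

# Keep roughly half of the data entries and half of the kernel entries.
K = two_way_punctured_kernel(X, eps_S=0.5, eps_B=0.5, rng=rng)
print("clustering accuracy:", sign_clustering_accuracy(K, labels))
```

In this sketch, lowering `eps_S` reduces how much of $X$ must be stored and multiplied, while lowering `eps_B` makes $K$ sparser; how far either can be pushed before performance degrades is what the paper's analysis characterises.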
