A New Basis for Sparse PCA

The statistical and computational performance of sparse principal component analysis (PCA) can be dramatically improved when the principal components are allowed to be sparse in a rotated eigenbasis. To this end, we propose a new method for sparse PCA. In the simplest version of the algorithm, the component scores and loadings are initialized with a low-rank singular value decomposition. Then, the singular vectors are rotated with orthogonal rotations to make them approximately sparse. Finally, soft-thresholding is applied to the rotated singular vectors. This approach differs from prior approaches because it uses an orthogonal rotation to approximate a sparse basis. Our sparse PCA framework is versatile; for example, it extends naturally to the two-way analysis of a data matrix for simultaneous dimensionality reduction of rows and columns. We identify the close relationship between sparse PCA and independent component analysis for separating sparse signals. We provide empirical evidence showing that, for the same level of sparsity, the proposed sparse PCA method is more stable and can explain more variance than alternative methods. Through three applications (sparse coding of images, analysis of transcriptome sequencing data, and large-scale clustering of Twitter accounts), we demonstrate the usefulness of sparse PCA in exploring modern multivariate data.
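The three steps named in the abstract (low-rank SVD, orthogonal rotation, soft-thresholding) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which orthogonal rotation is used, so varimax is assumed here, and the function names `varimax`, `soft_threshold`, `sparse_pca_sketch` and the fixed `threshold` parameter are hypothetical.

```python
import numpy as np


def varimax(V, max_iter=100, tol=1e-8):
    """Return an orthogonal k x k matrix R that makes V @ R approximately sparse.

    Assumption: varimax is used as the orthogonal rotation; the abstract
    only says "orthogonal rotations".
    """
    p, k = V.shape
    R = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        L = V @ R
        # Gradient of the varimax criterion with respect to the rotation.
        G = V.T @ (L**3 - L @ np.diag(np.sum(L**2, axis=0)) / p)
        U, s, Wt = np.linalg.svd(G)
        R = U @ Wt  # product of orthogonal matrices, hence orthogonal
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return R


def soft_threshold(A, t):
    """Shrink every entry of A toward zero by t, zeroing entries below t."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)


def sparse_pca_sketch(X, k, threshold):
    """Hypothetical sketch: SVD -> rotation -> soft-thresholding."""
    # Step 1: initialize with a rank-k singular value decomposition.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:k].T  # p x k right singular vectors (loadings)
    # Step 2: rotate the singular vectors toward a sparse basis.
    R = varimax(V)
    rotated = V @ R
    # Step 3: soft-threshold the rotated vectors to obtain exact zeros.
    return soft_threshold(rotated, threshold), R


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 8))
    loadings, R = sparse_pca_sketch(X, k=3, threshold=0.1)
    print(loadings.shape, np.count_nonzero(loadings == 0.0))
```

Because the rotation is orthogonal, the rotated vectors span the same subspace as the leading singular vectors; only the final thresholding step changes that subspace. In practice the threshold would be chosen to meet a target sparsity level rather than fixed by hand.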


Related content

PCA

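In statistics, principal component analysis (PCA) is a method that projects data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called principal components.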
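As a minimal sketch of the description above, ordinary PCA can be computed from the singular value decomposition of the centered data matrix. The function name `pca` and its return values are chosen here for illustration only.

```python
import numpy as np


def pca(X, k):
    """Return the top-k principal components, scores, and explained variances."""
    # Center each column so that variance is measured around the mean.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]
    # Project the centered data onto the leading directions.
    scores = Xc @ components.T
    explained_variance = s[:k] ** 2 / (X.shape[0] - 1)
    return components, scores, explained_variance


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    components, scores, variances = pca(X, k=2)
    # The sample covariance of the scores is diagonal: the new dimensions
    # are uncorrelated, as stated in the text.
    print(np.round(np.cov(scores, rowvar=False), 6))
```

The off-diagonal entries of the printed covariance matrix are zero up to rounding, which is the "uncorrelated dimensions" property of the orthogonal basis.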
