Given a data matrix $\mathbf{A} \in \mathbb{R}^{n \times d}$, principal component projection (PCP) and principal component regression (PCR), i.e. projection and regression restricted to the top-eigenspace of $\mathbf{A}$, are fundamental problems in machine learning, optimization, and numerical analysis. In this paper we provide the first algorithms that solve these problems in nearly linear time for fixed eigenvalue distribution and large n. This improves upon previous methods which have superlinear running times when both the number of top eigenvalues and inverse gap between eigenspaces is large. We achieve our results by applying rational approximations to reduce PCP and PCR to solving asymmetric linear systems which we solve by a variant of SVRG. We corroborate these findings with preliminary empirical experiments.
翻译:根据一个数据矩阵 $\ mathbf{A}\ in\ mathbb{R ⁇ n\ times d}$,主要组成部分预测和主要组成部分回归(PCR),即限于顶层天体的预测和回归($\mathbf{A}$),是机器学习、优化和数字分析的根本问题。在本文中,我们提供了在固定电子值分布和大数值分布的近线性时间里解决这些问题的第一种算法。在顶层电子值和顶层天体之间逆差都很大的情况下,这些方法具有超线性运行时间。我们通过使用合理近似法减少五氯苯酚和多光子体,以解决我们通过SVRG变量解决的不对称线性系统,从而取得了结果。我们用初步的经验实验证实了这些结果。