Estimating graphical model structure from high-dimensional and undersampled data is a fundamental problem in many scientific fields. Existing approaches, such as GLASSO, latent variable GLASSO, and latent tree models, suffer from high computational complexity and may impose unrealistic sparsity priors in some cases. We introduce a novel method that leverages a newly discovered connection between information-theoretic measures and structured latent factor models to derive an optimization objective which encourages modular structures where each observed variable has a single latent parent. The proposed method has linear stepwise computational complexity w.r.t. the number of observed variables. Our experiments on synthetic data demonstrate that our approach is the only method that recovers modular structure better as the dimensionality increases. We also use our approach for estimating covariance structure for a number of real-world datasets and show that it consistently outperforms state-of-the-art estimators at a fraction of the computational cost. Finally, we apply the proposed method to high-resolution fMRI data (with more than 10^5 voxels) and show that it is capable of extracting meaningful patterns.
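The structural assumption named above, that each observed variable has a single latent parent, can be illustrated with a small NumPy sketch. This is not the proposed estimation method. It only generates synthetic data from one possible modular latent factor model, under the assumption of unit-variance Gaussian variables and a shared loading. The function names, the `signal` parameter, and the equal-sized groups are hypothetical choices made for illustration.

```python
import numpy as np


def sample_modular_data(n_samples, n_latent, vars_per_latent, signal=0.8, seed=0):
    """Draw samples where every observed variable has exactly one latent parent.

    Illustrative assumption: x_i = signal * z_parent(i) + sqrt(1 - signal^2) * noise_i,
    so each observed variable has unit variance.
    """
    rng = np.random.default_rng(seed)
    p = n_latent * vars_per_latent
    # parent[i] is the index of the single latent factor that generates x_i.
    parent = np.repeat(np.arange(n_latent), vars_per_latent)
    z = rng.standard_normal((n_samples, n_latent))
    noise = rng.standard_normal((n_samples, p))
    x = signal * z[:, parent] + np.sqrt(1.0 - signal**2) * noise
    return x, parent


def modular_covariance(parent, signal=0.8):
    """Population covariance implied by the model above.

    Variables sharing a parent have covariance signal^2; variables with
    different parents are uncorrelated; the diagonal is 1.
    """
    same_parent = parent[:, None] == parent[None, :]
    cov = np.where(same_parent, signal**2, 0.0)
    np.fill_diagonal(cov, 1.0)
    return cov


if __name__ == "__main__":
    # Undersampled regime: fewer samples than observed variables.
    x, parent = sample_modular_data(n_samples=50, n_latent=20, vars_per_latent=10)
    print(x.shape)                            # (samples, observed variables)
    print(modular_covariance(parent).shape)   # block-diagonal target covariance
```

Under this assumed model the true covariance is block-diagonal, with one block per latent factor, which is the kind of modular structure a covariance estimator would be asked to recover when the number of samples is small relative to the number of variables.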