Persistent homology is a cornerstone of topological data analysis, offering a multiscale summary of topology that is robust to nuisance transformations such as rotations and small deformations. It has seen broad use across domains such as computer vision and neuroscience. Most statistical treatments, however, use homology primarily as a feature extractor, relying on statistical distance-based tests or simple time-to-event models for inference. While these approaches can detect global differences, they rarely localize the source of those differences. We address this gap with a graphical model-based approach: we associate each vertex with a population-level latent position in a conic space and model each bar's key events (birth and death times) with an exponential distribution whose rate is a transformation of the latent positions determined by the corresponding event on the graph. Low-dimensional bars have simple graph-event representations, such as the formation of a minimum spanning tree or the triangulation of a loop, and thus admit tractable likelihoods. Taking a Bayesian approach, we infer the latent positions and accommodate model extensions, such as hierarchical models that borrow strength across groups. Applications to a neuroimaging study of Alzheimer's disease demonstrate that our method localizes sources of difference and provides interpretable, model-based analyses of topological structure in complex data. The code is provided and maintained at https://github.com/zitianwu/graphPH.
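To make the link between low-dimensional bars and graph events concrete, the following minimal Python sketch illustrates one standard fact: in a Vietoris-Rips filtration, the finite zero-dimensional bars die exactly at the edge lengths of the minimum spanning tree. It then evaluates an exponential log-likelihood for those death times. The function names, the toy point cloud, and the single shared rate are assumptions made for illustration only. The model described above ties the rate to latent positions through a transformation that is not specified here, and this sketch does not use the graphPH code.

```python
import itertools
import math


def h0_death_times(points):
    """Death times of the finite H0 bars of a Vietoris-Rips filtration.

    Each finite H0 bar dies when two connected components merge, which
    happens exactly at the edges Kruskal's algorithm adds to the minimum
    spanning tree. The filtration value of an edge is taken to be its
    Euclidean length (some conventions use half of that).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    deaths = []
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # The edge joins two components, so one H0 bar dies here.
            parent[ri] = rj
            deaths.append(weight)
    return deaths


def exponential_loglik(times, rate):
    """Log-likelihood of i.i.d. Exponential(rate) event times."""
    return len(times) * math.log(rate) - rate * sum(times)


# Hypothetical toy point cloud (not data from the paper).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (4.0, 0.0)]
deaths = h0_death_times(points)

# Closed-form maximum likelihood estimate for one shared rate (an
# illustrative simplification, not the latent-position model).
rate_mle = len(deaths) / sum(deaths)
loglik_at_mle = exponential_loglik(deaths, rate_mle)
```

In a latent-position model of the kind described above, the shared `rate` would presumably be replaced by event-specific rates computed from the vertices involved in each merge; that mapping is left out here because its form is not given in the text.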