Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance. Although they demonstrate promising performance in various applications, most existing approaches directly fuse multiple pre-specified similarities to learn an optimal similarity matrix for clustering, which can lead to over-complicated optimization and intensive computational cost. In this paper, we propose late fusion MVC via alignment maximization to address these issues. To do so, we first reveal the theoretical connection between existing k-means clustering and the alignment between the base partitions and the consensus partition. Based on this observation, we propose a simple but effective multi-view algorithm termed LF-MVC-GAM. It optimally fuses information from multiple sources at the partition level of each individual view, and maximally aligns the consensus partition with these weighted base partitions. Such an alignment helps integrate partition-level information and significantly reduces computational complexity by simplifying the optimization procedure. We then design another variant, LF-MVC-LAM, to further improve clustering performance by preserving the local intrinsic structure among multiple partition spaces. After that, we develop two three-step iterative algorithms to solve the resulting optimization problems with theoretically guaranteed convergence. Further, we provide a generalization error bound analysis of the proposed algorithms. Extensive experiments on eighteen multi-view benchmark datasets, ranging from small to large scale, demonstrate the effectiveness and efficiency of the proposed LF-MVC-GAM and LF-MVC-LAM. The code of the proposed algorithms is publicly available at https://github.com/wangsiwei2010/latefusionalignment.
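To make the idea of partition-level alignment concrete, the following is a minimal sketch of a three-step alternating scheme. It is not the authors' implementation. It assumes a simplified objective, max Tr(H^T Σ_p β_p H_p W_p) subject to H^T H = I, W_p^T W_p = I, β ≥ 0 and ‖β‖₂ = 1, with no regularization or locality term. The function names `late_fusion_alignment` and `_procrustes`, the per-view rotations `W_p`, and the weight vector `beta` are illustrative assumptions not taken from the abstract.

```python
import numpy as np


def _procrustes(a):
    """Return Q with orthonormal columns that maximizes Tr(Q^T a)."""
    u, _, vt = np.linalg.svd(a, full_matrices=False)
    return u @ vt


def late_fusion_alignment(base_partitions, n_iter=50, tol=1e-8):
    """Alternating maximization of Tr(H^T sum_p beta_p H_p W_p).

    base_partitions: list of (n, k) arrays with orthonormal columns,
    one per view (assumed to be precomputed, e.g. by kernel k-means).
    This is a simplified, hypothetical objective, not the paper's exact one.
    """
    m = len(base_partitions)
    k = base_partitions[0].shape[1]
    beta = np.full(m, 1.0 / np.sqrt(m))          # uniform weights, unit L2 norm
    rotations = [np.eye(k) for _ in range(m)]    # one W_p per view
    history = []

    for _ in range(n_iter):
        # Step 1: with W_p and beta fixed, update the consensus partition H.
        fused = sum(b * hp @ w
                    for b, hp, w in zip(beta, base_partitions, rotations))
        consensus = _procrustes(fused)

        # Step 2: with H fixed, update each rotation W_p independently.
        rotations = [_procrustes(hp.T @ consensus) for hp in base_partitions]

        # Step 3: with H and W_p fixed, update beta in closed form.
        # By Cauchy-Schwarz, the maximizer is the normalized score vector.
        scores = np.array([np.trace(consensus.T @ hp @ w)
                           for hp, w in zip(base_partitions, rotations)])
        beta = scores / np.linalg.norm(scores)

        # Objective value after this sweep.
        history.append(float(beta @ scores))
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break

    return consensus, beta, rotations, history
```

Each step maximizes the objective over one block of variables while the others are held fixed, so the recorded objective values should not decrease from one sweep to the next. In this sketch the returned consensus matrix is a continuous relaxation; a common final step, assumed here rather than stated in the abstract, is to run k-means on its rows to obtain discrete cluster labels.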