Recent contrastive learning methods have been shown to be effective across a variety of tasks, learning generalizable representations that are invariant to data augmentation and thereby achieving state-of-the-art performance. The large unlabeled datasets used in self-supervised learning are often multifaceted, whereas the majority of real-world downstream tasks use a single data format; a multimodal framework that can train a single modality to learn diverse perspectives from other modalities is therefore an important challenge. In this paper, we propose TriCL (Triangular Contrastive Learning), a universal framework for trimodal contrastive learning. TriCL takes advantage of Triangular Area Loss, a novel intermodal contrastive loss that learns the angular geometry of the embedding space by simultaneously contrasting the areas of positive and negative triplets. Systematic observation of the embedding space in terms of alignment and uniformity shows that Triangular Area Loss can address the line-collapsing problem by discriminating modalities by angle. Our experimental results also demonstrate the superior performance of TriCL on the downstream task of molecular property prediction, which implies that the advantages of the embedding space indeed benefit performance on downstream tasks.
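To make the idea of contrasting triangle areas concrete, the following is a minimal NumPy sketch and not the formulation used by TriCL, which the abstract does not specify. It assumes three batches of embeddings `za`, `zb`, `zc` (one per modality, with row `i` of each forming a positive triplet), treats triplets `(za[i], zb[j], zc[j])` with `j != i` as negatives, and plugs the negative area into an InfoNCE-style softmax. The function names, the choice of negatives, and the `temperature` parameter are all illustrative assumptions.

```python
import numpy as np


def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c in R^d (broadcasts over leading axes)."""
    u = b - a
    v = c - a
    uu = np.sum(u * u, axis=-1)
    vv = np.sum(v * v, axis=-1)
    uv = np.sum(u * v, axis=-1)
    # Gram determinant |u|^2 |v|^2 - (u.v)^2; clip guards against tiny negative round-off.
    gram = np.clip(uu * vv - uv ** 2, 0.0, None)
    return 0.5 * np.sqrt(gram)


def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def area_contrastive_loss(za, zb, zc, temperature=0.1):
    """Hypothetical area-based contrastive loss (an assumption, not the paper's definition).

    For each anchor za[i], the positive triplet is (za[i], zb[i], zc[i]) and the
    negatives are (za[i], zb[j], zc[j]) for j != i. Smaller area means higher score.
    """
    za, zb, zc = (l2_normalize(z) for z in (za, zb, zc))
    # areas[i, j] = area of the triangle (za[i], zb[j], zc[j])
    areas = triangle_area(za[:, None, :], zb[None, :, :], zc[None, :, :])
    logits = -areas / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching index i as the target class.
    return -np.mean(np.diag(log_prob))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    za, zb, zc = rng.normal(size=(3, 4, 8))  # 4 triplets of 8-dimensional embeddings
    print(area_contrastive_loss(za, zb, zc))
```

The area is computed from the Gram determinant so that it works in any embedding dimension, not only in 2D or 3D. Because the area depends on the angle between the two edge vectors as well as their lengths, a loss of this kind is sensitive to angular structure among the three modalities, which is the property the abstract attributes to Triangular Area Loss.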