Emotional states manifest as coordinated yet heterogeneous physiological responses across the central and autonomic nervous systems, posing a fundamental challenge for multimodal representation learning in affective computing. Learning such joint dynamics is further complicated by the scarcity and subjectivity of affective annotations, which motivates the use of self-supervised learning (SSL). However, most existing SSL approaches rely on pairwise alignment objectives, which cannot characterize dependencies among more than two modalities and therefore fail to capture the higher-order interactions arising from coordinated brain and autonomic responses. To address this limitation, we propose Multimodal Functional Maximum Correlation (MFMC), a principled SSL framework that maximizes higher-order multimodal dependence through a Dual Total Correlation (DTC) objective. By deriving a tight sandwich bound on the DTC and optimizing it with a trace surrogate based on functional maximum correlation analysis (FMCA), MFMC captures joint multimodal interactions directly, without relying on pairwise contrastive losses. Experiments on three public affective computing benchmarks demonstrate that MFMC consistently achieves state-of-the-art or competitive performance under both subject-dependent and subject-independent evaluation protocols, highlighting its robustness to inter-subject variability. In particular, MFMC improves subject-dependent accuracy on CEAP-360VR from 78.9% to 86.8% and subject-independent accuracy from 27.5% to 33.1% using the EDA signal alone. Moreover, MFMC remains within 0.8 percentage points of the best-performing method on the most challenging EEG subject-independent split of MAHNOB-HCI. Our code is available at https://github.com/DY9910/MFMC.
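For reference, and independent of the paper's specific derivation, the dual total correlation underlying the objective is the standard information-theoretic quantity below, written here for M modality representations $Z_1,\dots,Z_M$ (this notation is assumed; the paper's sandwich bound and FMCA trace surrogate make this quantity tractable to optimize):
\[
  \mathrm{DTC}(Z_1,\dots,Z_M)
  \;=\; H(Z_1,\dots,Z_M) \;-\; \sum_{m=1}^{M} H\!\left(Z_m \mid Z_{\setminus m}\right)
  \;=\; \sum_{m=1}^{M} H\!\left(Z_{\setminus m}\right) \;-\; (M-1)\, H(Z_1,\dots,Z_M),
\]
where $Z_{\setminus m}$ denotes all representations except $Z_m$. DTC is non-negative and vanishes when the representations are mutually independent, so maximizing it rewards joint, rather than merely pairwise, dependence across modalities.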