Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, research on video DA remains limited. This is partly due to the complexity of adapting the different modalities of features in videos, which include correlation features extracted as long-term dependencies of pixels across spatiotemporal dimensions. Correlation features are highly associated with action classes and have proven effective for accurate video feature extraction in supervised action recognition. Yet the correlation features of the same action differ across domains due to domain shift. We therefore propose a novel Adversarial Correlation Adaptation Network (ACAN) that aligns action videos by aligning pixel correlations. ACAN aims to minimize the discrepancy between the distributions of correlation information across domains, termed Pixel Correlation Discrepancy (PCD). Video DA research is also limited by the lack of cross-domain video datasets with larger domain shifts. We therefore introduce a novel HMDB-ARID dataset, whose larger domain shift is caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of the proposed ACAN on both existing video DA datasets and the new one.