Existing self-supervised contrastive learning methods for skeleton-based action recognition often process all skeleton regions uniformly and adopt a first-in-first-out (FIFO) queue to store negative samples, which leads to the loss of motion information and suboptimal negative sample selection. To address these challenges, this paper proposes the Dominance-Game Contrastive Learning network for skeleton-based action Recognition (DoGCLR), a self-supervised framework based on game theory. DoGCLR models the construction of positive and negative samples as a dynamic Dominance Game, in which the two sample types interact to reach an equilibrium that balances semantic preservation and discriminative strength. Specifically, a spatio-temporal dual-weight localization mechanism identifies key motion regions and guides region-wise augmentations to enhance motion diversity while preserving semantics. In parallel, an entropy-driven dominance strategy manages the memory bank by retaining high-entropy (hard) negatives and replacing low-entropy (weak) ones, ensuring consistent exposure to informative contrastive signals. Extensive experiments are conducted on the NTU RGB+D and PKU-MMD datasets. On NTU RGB+D 60 X-Sub/X-View, DoGCLR achieves 81.1%/89.4% accuracy, and on NTU RGB+D 120 X-Sub/X-Set it achieves 71.2%/75.5% accuracy, surpassing state-of-the-art methods by 0.1%, 2.7%, 1.1%, and 2.3%, respectively. On PKU-MMD Part I, DoGCLR performs comparably to state-of-the-art methods, and on Part II it achieves 1.9% higher accuracy, highlighting its robustness in more challenging scenarios.
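To make the entropy-driven memory-bank update concrete, the following is a minimal sketch (not the authors' released code) of how such a replacement rule could be realized in PyTorch: each stored negative receives an entropy score computed from its softmax similarities to the current batch of queries, and the lowest-entropy (weak) entries are overwritten by incoming keys rather than evicted FIFO-style. Function names such as negative_entropy and update_memory_bank, and the specific entropy proxy, are illustrative assumptions.

    # Illustrative sketch only: entropy-driven memory-bank update (assumed names and entropy proxy).
    import torch
    import torch.nn.functional as F

    def negative_entropy(query, bank, temperature=0.07):
        """Hardness proxy: entropy of each stored negative's softmax similarity
        over the current batch of queries (higher = more informative negative)."""
        sim = torch.mm(F.normalize(bank, dim=1),
                       F.normalize(query, dim=1).t()) / temperature   # (K, B)
        p = F.softmax(sim, dim=1)
        return -(p * p.clamp_min(1e-12).log()).sum(dim=1)             # (K,)

    def update_memory_bank(bank, new_keys, query, temperature=0.07):
        """Replace the lowest-entropy (weak) negatives in `bank` with `new_keys`,
        keeping high-entropy (hard) negatives, instead of a FIFO queue."""
        scores = negative_entropy(query, bank, temperature)            # (K,)
        _, weakest = torch.topk(scores, k=new_keys.size(0), largest=False)
        bank = bank.clone()
        bank[weakest] = F.normalize(new_keys, dim=1)
        return bank

    # Usage: bank is a (K, D) tensor of stored negative embeddings; query and
    # new_keys are (B, D) query/key embeddings from the current batch, B <= K.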