In recent research, adversarial attacks on person detectors using patches or static 3D model-based texture modifications have struggled with low success rates due to the flexible nature of human movement. Modeling the 3D deformations caused by various actions has been a major challenge. Fortunately, advancements in Neural Radiance Fields (NeRF) for dynamic human modeling offer new possibilities. In this paper, we introduce UV-Attack, a groundbreaking approach that achieves high success rates even under extensive and unseen human actions. We address this challenge by leveraging dynamic-NeRF-based UV mapping. UV-Attack can generate human images across diverse actions and viewpoints, and can even create novel actions by sampling from the SMPL parameter space. While dynamic NeRF models are capable of modeling human bodies, modifying clothing textures is challenging because the textures are embedded in the network parameters. To tackle this, UV-Attack generates UV maps instead of RGB images and modifies the texture stacks. This approach enables real-time texture edits and makes the attack more practical. We also propose a novel Expectation over Pose Transformation (EoPT) loss to improve the evasion success rate on unseen poses and views. Our experiments show that UV-Attack achieves a 92.7% attack success rate (ASR) against the FastRCNN model across varied poses in dynamic video settings, significantly outperforming the state-of-the-art AdvCamou attack, which achieves only a 28.5% ASR. Moreover, we achieve a 49.5% ASR on the latest YOLOv8 detector in black-box settings. This work highlights the potential of dynamic-NeRF-based UV mapping for creating more effective adversarial attacks on person detectors, addressing key challenges in modeling human movement and texture modification. The code is available at https://github.com/PolyLiYJ/UV-Attack.
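To make the EoPT idea concrete, the following is a minimal sketch of how an expectation over sampled poses and views could be estimated by Monte Carlo sampling. The callables `render_person`, `person_confidence`, `sample_pose`, and `sample_view` are hypothetical placeholders for the dynamic-NeRF renderer, the target detector, and the SMPL/viewpoint samplers; they are assumptions for illustration, not the released UV-Attack API.

```python
import torch

def eopt_loss(uv_texture, render_person, person_confidence,
              sample_pose, sample_view, num_samples=16):
    """Sketch of an Expectation over Pose Transformation (EoPT) loss.

    Approximates E_{pose, view}[detector person confidence] by sampling
    poses from the SMPL parameter space and camera viewpoints, rendering
    the person with the adversarial UV texture applied, and averaging the
    detector's person-class confidence. Minimizing this expectation pushes
    the texture toward evading detection across unseen poses and views.

    All callables here are hypothetical stand-ins, not the paper's code.
    """
    confidences = []
    for _ in range(num_samples):
        pose = sample_pose()                          # SMPL pose parameters
        view = sample_view()                          # camera viewpoint
        img = render_person(pose, view, uv_texture)   # differentiable render
        confidences.append(person_confidence(img))    # scalar person score
    return torch.stack(confidences).mean()
```

In this sketch, the adversarial UV texture would then be updated by gradient descent on this expectation with respect to `uv_texture`, relying on the renderer and detector being differentiable end to end.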