Distributed learning frameworks, which partition neural network models across multiple computing nodes, improve efficiency in collaborative edge-cloud systems, but they may also introduce new vulnerabilities to evasion attacks, typically realized as adversarial perturbations. In this work, we present a new threat model that explores the feasibility of generating universal adversarial perturbations (UAPs) when the attacker has access only to the edge portion of the model, i.e., its initial network layers. Unlike traditional attacks that require full model knowledge, our approach shows that adversaries can induce mispredictions in the unknown cloud component by manipulating key feature representations at the edge. Under this threat model, we introduce both untargeted and targeted edge-only UAP formulations designed to control intermediate features before the split point. Our results on ImageNet demonstrate strong attack transferability to the unknown cloud part, and we compare the proposed method with classical white-box and black-box techniques, highlighting its effectiveness. Additionally, we analyze an attacker's capability to achieve targeted adversarial effects with edge-only knowledge, revealing intriguing behaviors across multiple networks. By introducing the first adversarial attacks with edge-only knowledge in split inference, this work underscores the importance of addressing partial model access in adversarial robustness and encourages further research in this area.
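To make the edge-only setting concrete, the following is a minimal sketch, assuming a PyTorch setup: it optimizes an untargeted UAP using only the edge sub-model by maximizing the displacement of the features produced at the split point, so that the unknown cloud part receives corrupted representations. The name edge_model, the MSE feature objective, and all hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (not the paper's exact algorithm): edge-only untargeted UAP.
# Assumes `edge_model` maps images in [0, 1] to split-point features and
# `loader` yields (image, label) batches; both are hypothetical placeholders.
import torch
import torch.nn.functional as F

def edge_only_uap(edge_model, loader, eps=10 / 255, steps=2000, lr=0.01, device="cpu"):
    edge_model.eval().to(device)
    # Single universal perturbation, broadcast over every input image.
    delta = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    it = iter(loader)
    for _ in range(steps):
        try:
            x, _ = next(it)
        except StopIteration:
            it = iter(loader)  # cycle through the data
            x, _ = next(it)
        x = x.to(device)
        with torch.no_grad():
            clean_feat = edge_model(x)  # clean features at the split point
        adv_feat = edge_model((x + delta).clamp(0, 1))
        # Untargeted objective: maximize feature-space displacement,
        # implemented as gradient descent on the negated MSE.
        loss = -F.mse_loss(adv_feat, clean_feat)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the UAP within an L-infinity ball
    return delta.detach()
```

A targeted variant along the same lines would instead minimize the distance between adv_feat and features extracted from samples of the chosen target class; the L-infinity clamp keeps the perturbation quasi-imperceptible regardless of the objective.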