Large Language Models (LLMs) are increasingly trusted to perform automated code review and static analysis at scale, supporting tasks such as vulnerability detection, summarization, and refactoring. In this paper, we identify and exploit a critical vulnerability in LLM-based code analysis: an abstraction bias that causes models to overgeneralize familiar programming patterns and overlook small, meaningful bugs. Adversaries can exploit this blind spot to hijack the control flow of the LLM's interpretation with minimal edits and without affecting actual runtime behavior. We refer to this attack as a Familiar Pattern Attack (FPA). We develop a fully automated, black-box algorithm that discovers and injects FPAs into target code. Our evaluation shows that FPAs are not only effective against basic and reasoning models, but are also transferable across model families (OpenAI, Anthropic, Google), and universal across programming languages (Python, C, Rust, Go). Moreover, FPAs remain effective even when models are explicitly warned about the attack via robust system prompts. Finally, we explore positive, defensive uses of FPAs and discuss their broader implications for the reliability and safety of code-oriented LLMs.
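To make the abstraction bias concrete, the following is a minimal, hypothetical Python sketch of the kind of blind spot the abstract describes. It is not code from the paper's evaluation, and it is not an FPA produced by the paper's algorithm; the function names, the off-by-one bug, and the "familiar pattern" camouflage edits are invented here for exposition only. The idea is that the camouflage edits are semantically inert at runtime, yet make the buggy function resemble a textbook idiom that a pattern-matching reviewer may summarize as correct.

```python
"""Hypothetical illustration of an abstraction-bias blind spot.

NOT taken from the paper; names, bug, and camouflage are invented for exposition.
"""


def lookup(items, index):
    # Real bug: the bounds check uses `<=` instead of `<`, so index == len(items)
    # passes the guard and items[index] raises IndexError at runtime.
    if 0 <= index <= len(items):
        return items[index]
    return None


def lookup_camouflaged(items, index):
    """Safely return items[index], or None if the index is out of range."""
    # The FPA-style edits here (reassuring docstring, extra guard clause, familiar
    # variable names) do not change runtime behavior: the same off-by-one bug is
    # still present. The function now pattern-matches the well-known
    # "validate index, then access" idiom, which a reviewer model may accept
    # without checking the boundary condition.
    if not items:
        return None
    if 0 <= index <= len(items):  # unchanged bug: `<=` should be `<`
        return items[index]
    return None


if __name__ == "__main__":
    data = ["a", "b", "c"]
    for fn in (lookup, lookup_camouflaged):
        try:
            fn(data, len(data))  # boundary input that slips past the broken check
        except IndexError as exc:
            print(f"{fn.__name__}: IndexError -> {exc}")
```

Both variants fail on the boundary input in the same way; only the surface form differs, which is exactly the gap between runtime semantics and the model's pattern-level interpretation that an FPA exploits.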