Recent advances in Large Language Models have revolutionized function-level code generation; however, repository-scale Automated Program Repair (APR) remains a significant challenge. Current approaches typically employ a control-centric paradigm that forces agents to navigate complex directory structures and irrelevant control logic. In this paper, we propose a paradigm shift from standard Code Property Graphs (CPGs) to the Data Transformation Graph (DTG), which inverts the topology by modeling data states as nodes and functions as edges, enabling agents to trace logic defects through data lineage rather than control flow. We introduce a multi-agent framework that reconciles data integrity navigation with control flow logic. Our theoretical analysis and case studies demonstrate that this approach resolves the "Semantic Trap" inherent in the standard RAG systems used by modern coding agents. We provide a comprehensive implementation, the Autonomous Issue Resolver (AIR), a self-improving system for zero-touch code maintenance that combines neuro-symbolic reasoning with the DTG structure for scalable logic repair. Our approach performs well on several SWE benchmarks, reaching a resolution rate of 87.1% on the SWE-Verified benchmark. It directly addresses the core limitations of current AI code-assistant tools and answers the critical need for a more robust foundation for an increasingly software-dependent world.
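To make the inverted topology concrete, the following is a minimal sketch of a graph in which data states are nodes and functions are edges, with a backward walk that collects the functions on a faulty state's lineage. It is an illustration under stated assumptions, not the AIR implementation: the class name `DataTransformationGraph`, its methods, and the example states and functions (`raw_request`, `compute_price`, and so on) are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class DataTransformationGraph:
    """Hypothetical DTG sketch: data states are nodes, functions are edges."""

    # Maps each data state to the (function, input state) pairs that produce it.
    producers: dict = field(default_factory=lambda: defaultdict(list))

    def add_transformation(self, source, function, target):
        """Record that `function` transforms data state `source` into `target`."""
        self.producers[target].append((function, source))

    def lineage(self, state):
        """Return the functions upstream of `state`, nearest first.

        A breadth-first walk backwards along the edges: every function
        returned could have influenced the value observed in `state`.
        """
        seen_states = {state}
        functions = []
        queue = deque([state])
        while queue:
            current = queue.popleft()
            for function, source in self.producers.get(current, []):
                if function not in functions:
                    functions.append(function)
                if source not in seen_states:
                    seen_states.add(source)
                    queue.append(source)
        return functions


# Hypothetical repository fragment, expressed as data transformations.
dtg = DataTransformationGraph()
dtg.add_transformation("raw_request", "parse_request", "order")
dtg.add_transformation("order", "compute_price", "priced_order")
dtg.add_transformation("tax_table", "compute_price", "priced_order")
dtg.add_transformation("config_file", "load_tax_table", "tax_table")
dtg.add_transformation("priced_order", "render_invoice", "invoice")
dtg.add_transformation("invoice", "send_email", "email")

# An issue reports a wrong value in `invoice`: only upstream functions are suspects.
suspects = dtg.lineage("invoice")
print(suspects)
```

In this toy example, `send_email` never appears among the suspects because it lies downstream of the faulty state. The abstract describes the same idea when it contrasts tracing data lineage with navigating control flow and directory structure.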