Large language models (LLMs) are reshaping automated program repair. We present a unified taxonomy that groups 62 recent LLM-based repair systems into four paradigms defined by parameter adaptation and control authority over the repair loop, and overlays two cross-cutting layers for retrieval and analysis augmentation. Prior surveys have either focused on classical software repair techniques, on LLMs in software engineering more broadly, or on subsets of LLM-based software repair, such as fine-tuning strategies or vulnerability repair. We complement these works by treating fine-tuning, prompting, procedural pipelines, and agentic frameworks as first-class paradigms and systematically mapping representative systems to each paradigm. We also consolidate evaluation practice on common benchmarks by recording benchmark scope, pass@k, and fault-localization assumptions to support a more meaningful comparison of reported success rates. We clarify trade-offs among paradigms in task alignment, deployment cost, controllability, and ability to repair multi-hunk or cross-file bugs. We discuss challenges in current LLM-based software repair and outline research directions. Our artifacts, including the representative papers and the scripted survey pipeline, are publicly available at https://github.com/GLEAM-Lab/ProgramRepair.
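Since the abstract anchors comparison of reported success rates on pass@k, a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021) is given below; the function name and the example numbers are illustrative, not drawn from the surveyed systems.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total candidate patches sampled for a bug
    c: number of those candidates that pass all tests
    k: evaluation budget (patches a user would inspect)
    Returns the estimated probability that at least one of k
    randomly drawn candidates is correct: 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer failing candidates than the budget: success is certain.
        return 1.0
    # Compute 1 - C(n-c, k) / C(n, k) as a running product for numerical stability.
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Illustrative usage: 20 samples, 3 plausible patches, budget of 1.
print(pass_at_k(n=20, c=3, k=1))  # 0.15
```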