Language Models (LMs) are widely used in software engineering for code generation, yet they may produce erroneous code. Rather than repairing individual outputs, a more thorough remedy is to fix the underlying model failures. LM repair offers a lightweight solution: it requires minimal data, lowers computational cost, and limits side effects. Unlike full retraining, LM repair applies tailored updates to targeted neurons, making it suitable for scenarios with limited resources, high performance demands, or strict safety requirements. In this paper, we propose Semantic Targeting for Analytical Repair (STAR), a novel semantics-based optimization method for repairing LMs. STAR realizes the main operations of LM repair within a single optimization process: locating ``buggy neurons'', solving ``neuron patches'', and patching the ``buggy neurons''. The neuron patches are computed with a principled, semantics-based analytical formula that directly links the desired changes in logits to the required neuron deltas by steering latent representations. Compared with prior work on LM repair (MINT) and standard optimization methods (SGD), STAR integrates their strengths while mitigating their limitations. By reformulating LM repair as an optimization process, STAR can resolve multiple failures jointly, significantly improving its practicality. Evaluated on coding tasks with popular code LMs, STAR is more effective than the state of the art, and it is also more efficient. Regarding side effects, namely the balance between generalization and specificity, STAR outperforms prior work by a significant margin. We further assess the overfitting risk of LM repair as well as the cumulative impact of successive repairs. Finally, we analyze the differences from pipeline-based methods, explaining why STAR performs better and how it mitigates the common limitations of LM repair.
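To make the three operations concrete, the toy sketch below walks through locating buggy neurons, solving a patch in closed form, and applying it. Everything in it (the two-layer linear model, the attribution heuristic, and the minimum-norm patch) is an illustrative assumption for exposition, not STAR's actual formulas, which are derived for full code LMs.
\begin{verbatim}
import torch

torch.manual_seed(0)
d_in, d_hid, vocab = 16, 32, 10
W_ff = torch.randn(d_hid, d_in) * 0.1   # FFN rows: the "neurons" we may patch
W_U  = torch.randn(vocab, d_hid) * 0.1  # output head (unembedding)

x = torch.randn(d_in)   # latent input of the failing case
wrong, right = 3, 7     # suppose the model favors token `wrong` over `right`

def logits(W):
    return W_U @ (W @ x)

# (1) Locate buggy neurons: rank hidden units by how strongly their
#     activation pushes `wrong` above `right` (a simple attribution rule).
h = W_ff @ x
contrib = (W_U[wrong] - W_U[right]) * h
buggy = torch.topk(contrib, k=4).indices

# (2) Solve the neuron patch analytically: find the smallest change to the
#     buggy activations that flips the logit gap by a chosen margin, then
#     map it back to weight deltas in closed form -- no gradient descent.
l = logits(W_ff)
gap_needed = (l[wrong] - l[right]) + 1.0   # desired margin for `right`
u = (W_U[right] - W_U[wrong])[buggy]       # steering direction, buggy coords
delta_h = gap_needed * u / (u @ u)         # min-norm activation steering
# each buggy row w_i must satisfy (w_i + d_i) @ x = h_i + delta_h_i,
# whose minimum-norm solution is d_i = delta_h_i * x / ||x||^2
patch = delta_h[:, None] * x[None, :] / (x @ x)

# (3) Patch the buggy neurons only; every other weight stays untouched.
W_patched = W_ff.clone()
W_patched[buggy] += patch

print("before:", logits(W_ff)[[wrong, right]])
print("after: ", logits(W_patched)[[wrong, right]])
\end{verbatim}
The point of the sketch is the contrast with SGD: the patch follows in one analytical step from the desired logit change, and only the selected rows of W_ff are modified, which is what bounds the side effects.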