Modern code generation has made significant strides in functional correctness and execution efficiency. However, these systems often overlook a critical dimension of real-world software development: maintainability. To handle dynamic requirements with minimal rework, we propose MaintainCoder as a pioneering solution. It integrates the Waterfall model, design patterns, and multi-agent collaboration to systematically enhance cohesion and reduce coupling, yielding clear responsibility boundaries and better maintainability. We also introduce MaintainBench, a benchmark comprising requirement changes and novel dynamic metrics of maintenance effort. Experiments demonstrate that existing code generation methods struggle to meet maintainability standards when requirements evolve. In contrast, MaintainCoder improves dynamic maintainability metrics by more than 60% while also achieving higher correctness of the initial code. Furthermore, while static metrics fail to accurately reflect maintainability and even contradict each other, our proposed dynamic metrics exhibit high consistency. Our work not only provides a foundation for maintainable code generation, but also highlights the need for more realistic and comprehensive code generation research. Resources: https://github.com/IAAR-Shanghai/MaintainCoder.