Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
3-2025
Abstract
In recent years, AI-based software engineering has progressed from pre-trained models to advanced agentic workflows, with Software Development Agents representing the next major leap. These agents, capable of reasoning, planning, and interacting with external environments, offer promising solutions to complex software engineering tasks. However, while much research has evaluated code generated by large language models (LLMs), comprehensive studies on agent-generated patches, particularly in real-world settings, are lacking. This study addresses that gap by evaluating 4,892 patches from 10 top-ranked agents on 500 real-world GitHub issues from SWE-Bench Verified, focusing on their impact on code quality. Our analysis shows no single agent dominated, with 170 issues unresolved, indicating room for improvement. Even for patches that passed unit tests and resolved issues, agents made different file and function modifications compared to the gold patches from repository developers, revealing limitations in the benchmark's test case coverage. Most agents maintained code reliability and security, avoiding new bugs or vulnerabilities; while some agents increased code complexity, many reduced code duplication and minimized code smells. Finally, agents performed better on simpler codebases, suggesting that breaking complex tasks into smaller sub-tasks could improve effectiveness. This study provides the first comprehensive evaluation of agent-generated patches on real-world GitHub issues, offering insights to advance AI-driven software development.
Keywords
Software Development Agents, Patch Generation, Large Language Models, Code Quality, GitHub Issues
Discipline
Artificial Intelligence and Robotics | Software Engineering
Areas of Excellence
Digital transformation
Publication
Proceedings of the 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Montreal, Canada, March 4-7
First Page
657
Last Page
668
Identifier
10.1109/SANER64311.2025.00068
Publisher
IEEE Computer Society
City or Country
Los Alamitos, CA
Citation
CHEN, Zhi and JIANG, Lingxiao.
Evaluating software development agents: Patch patterns, code quality, and issue complexity in real-world GitHub scenarios. (2025). Proceedings of the 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Montreal, Canada, March 4-7. 657-668.
Available at: https://ink.library.smu.edu.sg/sis_research/10767
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/SANER64311.2025.00068