Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2023
Abstract
Code review is an effective software quality assurance activity; however, it is labor-intensive and time-consuming. Thus, a number of generation-based automatic code review (ACR) approaches have been proposed recently, which leverage deep learning techniques to automate various activities in the code review process (e.g., code revision generation and review comment generation). We find that prior works carry three main limitations. First, although each ACR approach has been shown to be beneficial in its own evaluation, these methods have not been comprehensively compared with one another to demonstrate their superiority over peer ACR approaches. Second, general-purpose pre-trained models such as CodeT5 have proven effective in a wide range of Software Engineering (SE) tasks; however, no prior work has investigated their effectiveness in ACR tasks. Third, prior works rely heavily on the Exact Match (EM) metric, which only credits perfect predictions and ignores the positive progress made by incomplete answers. To fill this research gap, we conduct a comprehensive study comparing the effectiveness of recent ACR tools as well as general-purpose pre-trained models. The results show that the general-purpose pre-trained model CodeT5 outperforms the other models in most cases. Specifically, CodeT5 outperforms the prior state of the art by 13.4%-38.9% on two code revision generation tasks. In addition, we introduce a new metric, Edit Progress (EP), to quantify the partial progress made by ACR tools. The results show that the ranking of models on each task can change depending on whether EM or EP is used. Lastly, we derive several insightful lessons from the experimental results and reveal future research directions for generation-based code review automation.
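The abstract's Edit Progress (EP) metric rewards predictions that move partway toward the target revision rather than crediting only exact matches. The paper's precise definition is not reproduced in this record; as an illustrative sketch only, one natural formulation (an assumption here, not the authors' stated formula) is the relative reduction in Levenshtein edit distance to the target achieved by the prediction, compared to leaving the input code unchanged:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein (edit) distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            # Minimum of deletion, insertion, and substitution/match.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def edit_progress(source: str, prediction: str, target: str) -> float:
    """Fraction of the source-to-target edit distance closed by the prediction.

    1.0  -> prediction equals the target (a perfect revision);
    0.0  -> prediction is no closer to the target than the source;
    < 0  -> prediction moved further away from the target.
    """
    base = levenshtein(source, target)
    if base == 0:  # source already equals target
        return 1.0 if prediction == target else 0.0
    return (base - levenshtein(prediction, target)) / base
```

Under this formulation, a model that fixes two of three required edits scores about 0.67 instead of the 0 that Exact Match would assign, which is the kind of partial credit the abstract argues can reorder model rankings.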
Keywords
Automatic codes, Code review, Engineering tasks, Labor time, Labour-intensive, Learning techniques, Research gaps, Review process, Software quality assurance, State of the art
Discipline
Databases and Information Systems | Software Engineering
Research Areas
Data Science and Engineering
Publication
Proceedings of the 31st IEEE/ACM International Conference on Program Comprehension, Melbourne, Australia, 2023 May 15-16
First Page
215
Last Page
226
ISBN
9798350337501
Identifier
10.1109/ICPC58990.2023.00036
Publisher
IEEE
City or Country
New Jersey
Citation
ZHOU, Xin; KIM, Kisub; XU, Bowen; HAN, DongGyun; HE, Junda; and LO, David.
Generation-based code review automation: How far are we?. (2023). Proceedings of the 31st IEEE/ACM International Conference on Program Comprehension, Melbourne, Australia, 2023 May 15-16. 215-226.
Available at: https://ink.library.smu.edu.sg/sis_research/8567
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/ICPC58990.2023.00036