Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
5-2021
Abstract
Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite their foundational role, little effort has been made to evaluate their accuracy, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach that automatically compares the similarity of the statements and tokens mapped by different algorithms. By performing the comparison, we determine whether each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine whether three commonly used AST mapping algorithms generate accurate mappings for 200 statements and their tokens. Based on the experts’ feedback, we observe that our approach achieves a precision of 0.98-1.00 and a recall of 0.65-0.75. Furthermore, we conduct a large-scale study with a dataset of ten Java projects containing a total of 263,165 file revisions. Our approach determines that GumTree, MTDiff and IJM generate inaccurate mappings for 20%-29%, 25%-36% and 21%-30% of the file revisions, respectively. Our experimental results show that state-of-the-art AST mapping algorithms still need improvement.
Keywords
abstract syntax tree, program element mapping, software evolution
Discipline
Artificial Intelligence and Robotics | Databases and Information Systems
Research Areas
Data Science and Engineering; Intelligent Systems and Optimization
Publication
43rd IEEE/ACM International Conference on Software Engineering (ICSE 2021)
First Page
1174
Last Page
1185
ISBN
9781665402965
Identifier
10.1109/ICSE43902.2021.00108
Publisher
IEEE
City or Country
Madrid, Spain
Citation
FAN, Yuanrui; XIA, Xin; LO, David; HASSAN, Ahmed E.; WANG, Yuan; and LI, Shanping.
A differential testing approach for evaluating abstract syntax tree mapping algorithms. (2021). 43rd IEEE/ACM International Conference on Software Engineering (ICSE 2021). 1174-1185.
Available at: https://ink.library.smu.edu.sg/sis_research/6879
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.