Assessing code clone harmfulness: Indicators, factors, and counter measures
Publication Type
Conference Proceeding Article
Publication Date
3-2021
Abstract
Code clones are identical or similar code in software projects. On one hand, developers clone code to achieve higher productivity and thus clones inherently exist; on the other hand, code clones demand extra effort to maintain the consistency between clone instances and may introduce bugs, and thus are often considered harmful for software maintenance and quality. We believe that not all code clones have the same level of harmfulness. A systematic way of assessing the harmfulness level of cloned code would facilitate informed decisions on how to deal with clones. We propose a model for clone harmfulness level assessment with four quantitative indicators that can be extracted from the evolution history of the clones. Specifically, we gather information, such as code clone changes and bug-fixes related to clone divergence and re-synchronization, to find objective evidence that a clone harms the software quality or brings potential risks even if no bugs are found. The assessment model consists of four harmfulness levels of clones determined by the four indicators. We also derive three harmfulness factors from the intrinsic properties of clones that potentially affect the harmfulness of clones. We conduct a large-scale empirical study with five open-source and three industry systems and find that 61.0-84.7% of the clones are not harmful in terms of consistent maintenance overhead. We find evidence in the evolution history that several factors, such as spread of clone instances, number of clone instances, and number of developers, have non-trivial correlation with clone harmfulness levels. We also propose six counter measures for clone harmfulness mitigation based on the observation of the harmfulness factors, and have collected useful feedback from industrial software architects and senior developers through an interview meeting.
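The abstract does not define the four indicators, so the following is only a minimal sketch of one kind of evidence that could be mined from a clone group's evolution history: whether each commit changed all clone instances together (a consistent change) or only some of them (a possible divergence). The function names, the instance identifiers, and the commit-to-instances mapping are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def classify_changes(instances: Set[str],
                     history: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Split commits touching a clone group into consistent and inconsistent.

    instances: identifiers of all clone instances in one group (assumed format).
    history:   maps a commit id to the set of code fragments it modified.
    """
    result: Dict[str, List[str]] = {"consistent": [], "inconsistent": []}
    for commit, touched in history.items():
        touched_in_group = touched & instances
        if not touched_in_group:
            continue  # commit did not touch this clone group at all
        if touched_in_group == instances:
            result["consistent"].append(commit)    # every instance changed
        else:
            result["inconsistent"].append(commit)  # only some instances changed
    return result


def inconsistent_ratio(instances: Set[str],
                       history: Dict[str, Set[str]]) -> float:
    """Fraction of group-touching commits that changed only some instances."""
    classified = classify_changes(instances, history)
    total = len(classified["consistent"]) + len(classified["inconsistent"])
    return len(classified["inconsistent"]) / total if total else 0.0


# Hypothetical clone group with two instances and three commits.
group = {"a.c:10", "b.c:42"}
commits = {
    "c1": {"a.c:10", "b.c:42"},  # both instances changed together
    "c2": {"a.c:10"},            # only one instance changed
    "c3": {"unrelated.c:7"},     # does not touch the group
}
classified = classify_changes(group, commits)
ratio = inconsistent_ratio(group, commits)
```

A high ratio here would only flag a group for closer inspection; deciding whether a divergence is intentional or a missed update would still need the bug-fix and re-synchronization evidence the abstract mentions.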
Keywords
code clone harmfulness, clone analysis, clone evolution, consistent changes
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
2021 28th IEEE International Conference on Software Analysis, Evolution and Reengineering; Virtual, March 9-12: Proceedings
First Page
225
Last Page
236
ISBN
9781728196305
Identifier
10.1109/SANER50967.2021.00029
Publisher
IEEE
City or Country
Piscataway, NJ
Citation
HU, Bin; WU, Yijian; PENG, Xin; SUN, Jun; ZHAN, Nanjie; and WU, Jun.
Assessing code clone harmfulness: Indicators, factors, and counter measures. (2021). 2021 28th IEEE International Conference on Software Analysis, Evolution and Reengineering; Virtual, March 9-12: Proceedings. 225-236.
Available at: https://ink.library.smu.edu.sg/sis_research/6193
Additional URL
https://doi.org/10.1109/SANER50967.2021.00029