Assessing code clone harmfulness: Indicators, factors, and counter measures

Publication Type

Conference Proceeding Article

Publication Date

3-2021

Abstract

Code clones are identical or similar code in software projects. On one hand, developers clone code to achieve higher productivity and thus clones inherently exist; on the other hand, code clones demand extra effort to maintain the consistency between clone instances and may introduce bugs, and thus are often considered harmful for software maintenance and quality. We believe that not all code clones have the same level of harmfulness. A systematic way of assessing the harmfulness level of cloned code would facilitate informed decisions on how to deal with clones. We propose a model for clone harmfulness level assessment with four quantitative indicators that can be extracted from the evolution history of the clones. Specifically, we gather information, such as code clone changes and bug-fixes related to clone divergence and re-synchronization, to find objective evidence that a clone harms the software quality or brings potential risks even if no bugs are found. The assessment model consists of four harmfulness levels of clones determined by the four indicators. We also derive three harmfulness factors from the intrinsic properties of clones that potentially affect the harmfulness of clones. We conduct a large-scale empirical study with five open-source and three industry systems and find that 61.0-84.7% of the clones are not harmful in terms of consistent maintenance overhead. We find evidence in the evolution history that several factors, such as spread of clone instances, number of clone instances, and number of developers, have non-trivial correlation with clone harmfulness levels. We also propose six counter measures for clone harmfulness mitigation based on the observation of the harmfulness factors, and have collected useful feedback from industrial software architects and senior developers through an interview meeting.

Keywords

code clone harmfulness, clone analysis, clone evolution, consistent changes

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

2021 28th IEEE International Conference on Software Analysis, Evolution and Reengineering; Virtual, March 9-12: Proceedings

First Page

225

Last Page

236

ISBN

9781728196305

Identifier

10.1109/SANER50967.2021.00029

Publisher

IEEE

City or Country

Piscataway, NJ

Additional URL

https://doi.org/10.1109/SANER50967.2021.00029
