Research Collection School Of Computing and Information Systems

CLCMiner: Detecting cross-language clones without intermediates

Xiao CHENG, Singapore Management University
Zhiming PENG
Lingxiao JIANG, Singapore Management University
Hao ZHONG
Haibo YU
Jianjun ZHAO

Publication Type

Journal Article

Version

publishedVersion

Publication Date

2-2017

Abstract

The proliferation of diverse kinds of programming languages and platforms makes it a common need to have the same functionality implemented in different languages for different platforms, such as Java for Android applications and C# for Windows Phone applications. Although versions of code written in different languages appear syntactically quite different from each other, they are intended to implement the same software and typically contain many code snippets that implement similar functionalities, which we call cross-language clones. When the version of code in one language evolves according to changing functionality requirements and/or bug fixes, its cross-language clones may also need to be changed to maintain consistent implementations for the same functionality. Thus, automated ways are needed to locate and track cross-language clones within the evolving software. In the literature, approaches for detecting cross-language clones are only for languages that share a common intermediate language (such as the .NET language family) because they are built on techniques for detecting single-language clones. To extend the capability of cross-language clone detection to more diverse kinds of languages, we propose a novel automated approach, CLCMiner, without the need of an intermediate language. It mines such clones from revision histories, based on our assumption that revisions to different versions of code implemented in different languages may naturally reflect how programmers change cross-language clones in practice, and that similarities among the revisions (referred to as clones in diffs or diff clones) may indicate actual similar code. We have implemented a prototype and applied it to ten open source projects with implementations in both Java and C#. The reported clones that occur in revision histories have high precision (89% on average) and recall (95% on average). Compared with token-based code clone detection tools that can treat code as plain text, our tool can detect significantly more cross-language clones. All the evaluation results demonstrate the feasibility of revision-history based techniques for detecting cross-language clones without intermediates and point to promising future work.
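
The idea of comparing revisions across languages can be illustrated with a minimal sketch. It is not CLCMiner's actual algorithm: the abstract does not specify the similarity measure, so this example assumes a simple multiset Jaccard score over lowercased tokens taken from the added and removed lines of two unified diffs. All function names and the sample diffs are hypothetical.

```python
import re
from collections import Counter

# Assumed tokenizer: identifiers, integer literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S")


def changed_lines(unified_diff):
    """Return added/removed lines of a unified diff, without the +/- marker."""
    lines = []
    for line in unified_diff.splitlines():
        if line.startswith(("+++ ", "--- ")):  # file headers, not code changes
            continue
        if line.startswith(("+", "-")):
            lines.append(line[1:])
    return lines


def token_bag(unified_diff):
    """Count lowercased tokens over all changed lines of a diff."""
    bag = Counter()
    for line in changed_lines(unified_diff):
        bag.update(tok.lower() for tok in TOKEN_RE.findall(line))
    return bag


def diff_similarity(diff_a, diff_b):
    """Multiset Jaccard similarity between two diffs (an assumed measure)."""
    a, b = token_bag(diff_a), token_bag(diff_b)
    shared = sum((a & b).values())  # per-token minimum counts
    total = sum((a | b).values())   # per-token maximum counts
    return shared / total if total else 0.0


# Hypothetical diffs: the same change made in a Java file and a C# file.
java_diff = """--- a/Foo.java
+++ b/Foo.java
@@ -1,3 +1,3 @@
-int count = list.size();
+int count = list.size() + 1;
"""
cs_diff = """--- a/Foo.cs
+++ b/Foo.cs
@@ -1,3 +1,3 @@
-int count = list.Count;
+int count = list.Count + 1;
"""
unrelated_diff = """--- a/Bar.cs
+++ b/Bar.cs
@@ -5,2 +5,3 @@
+string name = reader.ReadLine();
"""

if __name__ == "__main__":
    print("Java vs C# (same change):", diff_similarity(java_diff, cs_diff))
    print("Java vs unrelated change:", diff_similarity(java_diff, unrelated_diff))
```

Under these assumptions, the pair of diffs that make the same change in Java and C# scores higher than the unrelated pair, which is the kind of signal a diff-clone approach would threshold on. A real detector would also need to pair revisions across repositories and filter out trivial changes.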

Keywords

cross-language clone, code clone, revision, diff, similarity

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEICE Transactions on Information and Systems

Volume

E100-D

Issue

2

First Page

273

Last Page

284

ISSN

0916-8532

Identifier

10.1587/transinf.2016EDP7334

Publisher

Institute of Electronics, Information and Communication Engineers

Citation

CHENG, Xiao; PENG, Zhiming; JIANG, Lingxiao; ZHONG, Hao; YU, Haibo; and ZHAO, Jianjun. CLCMiner: Detecting cross-language clones without intermediates. (2017). IEICE Transactions on Information and Systems. E100-D, (2), 273-284.
Available at: https://ink.library.smu.edu.sg/sis_research/3644

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.1587/transinf.2016EDP7334

Included in

Software Engineering Commons
