Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

5-2009

Abstract

Similar code may exist in large software projects due to some common software engineering practices, such as copying and pasting code and n-version programming. Although previous work has studied syntactic equivalence and small-scale, coarse-grained program-level and function-level semantic equivalence, it is not known whether significant fine-grained, code-level semantic duplications exist. Detecting such semantic equivalence is also desirable because it can enable many applications such as code understanding, maintenance, and optimization. In this paper, we introduce the first algorithm to automatically mine functionally equivalent code fragments of arbitrary size - down to an executable statement. Our notion of functional equivalence is based on input and output behavior. Inspired by Schwartz's randomized polynomial identity testing, we develop our core algorithm using automated random testing: (1) candidate code fragments are automatically extracted from the input program; and (2) random inputs are generated to partition the code fragments based on their output values on the generated inputs. We implemented the algorithm and conducted a large-scale empirical evaluation of it on the Linux kernel 2.6.24. Our results show that there exist many functionally equivalent code fragments that are syntactically different (i.e., they are unlikely due to copying and pasting code). The algorithm also scales to million-line programs; it was able to analyze the Linux kernel with several days of parallel processing.

Keywords

code clones, random testing, functional equivalence

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ISSTA '09: Proceedings of the 2009 International Symposium on Software Testing and Analysis, Chicago, July 19-23, 2009

First Page

Last Page

ISBN

9781605583389

Identifier

10.1145/1572272.1572283

Publisher

ACM

City or Country

New York

Citation

JIANG, Lingxiao and SU, Zhendong. Automatic mining of functionally equivalent code fragments via random testing. (2009). ISSTA '09: Proceedings of the 2009 International Symposium on Software Testing and Analysis, Chicago, July 19-23, 2009. 81-92.
Available at: https://ink.library.smu.edu.sg/sis_research/954

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.1145/1572272.1572283

Download

Find it in your library

Included in

Software Engineering Commons

COinS

Research Collection School Of Computing and Information Systems

Automatic mining of functionally equivalent code fragments via random testing

Publication Type

Version

Publication Date

Abstract

Keywords

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Copyright Owner and License

Creative Commons License

Additional URL

Included in

Search

Links

Browse

Links

Research Collection School Of Computing and Information Systems

Automatic mining of functionally equivalent code fragments via random testing

Author

Publication Type

Version

Publication Date

Abstract

Keywords

Discipline

Research Areas

Publication

First Page

Last Page

ISBN

Identifier

Publisher

City or Country

Citation

Copyright Owner and License

Creative Commons License

Additional URL

Included in

Share

Search

Links

Browse

Links