Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

2-2020

Abstract

Binary diffing analysis quantitatively measures the differences between two given binaries and produces fine-grained basic block matching. It has been widely used to enable different kinds of critical security analysis. However, all existing program-analysis and machine-learning-based techniques suffer from low accuracy, poor scalability, or coarse granularity, or require extensive labeled training data to function. In this paper, we propose an unsupervised program-wide code representation learning technique to solve the problem. We rely on both the code semantic information and the program-wide control flow information to generate block embeddings. Furthermore, we propose a k-hop greedy matching algorithm to find the optimal diffing results using the generated block embeddings. We implement a prototype called DeepBinDiff and evaluate its effectiveness and efficiency with a large number of binaries. The results show that our tool outperforms the state-of-the-art binary diffing tools by a large margin for both cross-version and cross-optimization-level diffing. A case study for OpenSSL using real-world vulnerabilities further demonstrates the usefulness of our system.

Discipline

Information Security

Research Areas

Cybersecurity; Information Systems and Management

Publication

Proceedings of the Network and Distributed System Security Symposium, California, USA, 2020 February 23-26

Identifier

10.14722/ndss.2020.24311

City or Country

Citation

DUAN, Yue; LI, Xuezixiang; WANG, Jinghan; and YIN, Heng. DeepBinDiff: Learning program-wide code representations for binary diffing. (2020). Proceedings of the Network and Distributed System Security Symposium, California, USA, 2020 February 23-26.
Available at: https://ink.library.smu.edu.sg/sis_research/8168

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.14722/ndss.2020.24311

Included in

Information Security Commons

Research Collection School Of Computing and Information Systems

DeepBinDiff: Learning program-wide code representations for binary diffing
