Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

3-2024

Abstract

Pseudocode diffing precisely locates similar parts and captures differences between the decompiled pseudocode of two given binaries. It is particularly useful in many security scenarios, such as code plagiarism detection, lineage analysis, patch analysis, and vulnerability analysis. However, existing pseudocode diffing and binary diffing tools suffer from low accuracy and poor scalability, since they rely either on manually designed heuristics (e.g., Diaphora) or on heavy computations like matrix factorization (e.g., DeepBinDiff). To address these limitations, we propose a semantics-aware, deep neural network-based model called SIGMADIFF. SIGMADIFF first constructs IR (Intermediate Representation) level interprocedural program dependency graphs (IPDGs). It then uses a lightweight symbolic analysis to extract initial node features and locate training nodes for the neural network model. Next, SIGMADIFF leverages the state-of-the-art graph matching model called Deep Graph Matching Consensus (DGMC) to match the nodes in IPDGs. SIGMADIFF also introduces several important updates to the design of DGMC, such as the pre-training and fine-tuning schema. Experimental results show that SIGMADIFF significantly outperforms the state-of-the-art heuristic-based and deep learning-based techniques in terms of both accuracy and efficiency. It is able to precisely pinpoint eight vulnerabilities in a widely used video conferencing application.
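The general idea of matching nodes across two dependency graphs, starting from per-node features and then refining with neighborhood agreement, can be illustrated with a small toy sketch. This is not SIGMADIFF or DGMC: there is no neural network, no symbolic analysis, and no IPDG construction here. The function name `match_nodes`, the equal weighting of feature and neighborhood scores, the greedy assignment, and the example graphs are all assumptions made purely for illustration.

```python
from itertools import product


def match_nodes(g1, g2, feats1, feats2, rounds=2):
    """Toy node matcher (illustrative only, not the paper's model).

    g1, g2: dict mapping node -> list of successor nodes.
    feats1, feats2: dict mapping node -> feature string (e.g. an opcode).
    """
    def base(u, v):
        # Initial similarity: 1.0 when node features are identical.
        return 1.0 if feats1[u] == feats2[v] else 0.0

    sim = {(u, v): base(u, v) for u, v in product(g1, g2)}

    for _ in range(rounds):
        refined = {}
        for u, v in sim:
            nu, nv = g1[u], g2[v]
            if nu and nv:
                # Neighborhood agreement: how well u's successors
                # can be paired with v's successors under current scores.
                agree = sum(max(sim[(a, b)] for b in nv) for a in nu)
                agree /= max(len(nu), len(nv))
            else:
                # Two leaf nodes agree; a leaf and a non-leaf do not.
                agree = 1.0 if not nu and not nv else 0.0
            # Assumed weighting: half feature score, half neighborhood score.
            refined[(u, v)] = 0.5 * base(u, v) + 0.5 * agree
        sim = refined

    # Greedy one-to-one assignment, highest score first.
    matches, used1, used2 = {}, set(), set()
    for (u, v), score in sorted(sim.items(), key=lambda kv: -kv[1]):
        if score > 0 and u not in used1 and v not in used2:
            matches[u] = v
            used1.add(u)
            used2.add(v)
    return matches


# Hypothetical example: two "load" nodes per graph are ambiguous by
# feature alone, but only one of them feeds a "store".
g1 = {"a1": ["b1"], "b1": ["c1"], "c1": [], "d1": []}
feats1 = {"a1": "add", "b1": "load", "c1": "store", "d1": "load"}
g2 = {"x": [], "y": ["z"], "z": [], "w": ["y"]}
feats2 = {"x": "load", "y": "load", "z": "store", "w": "add"}

result = match_nodes(g1, g2, feats1, feats2)
print(result)
```

In this example the feature scores alone cannot tell whether `b1` corresponds to `x` or `y`, since both are `load` nodes. The neighborhood step resolves it: `b1` and `y` each have a `store` successor, so `b1` is paired with `y` and the leaf `d1` with `x`.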

Discipline

Information Security | OS and Networks

Research Areas

Information Systems and Management

Publication

Proceedings of the 31st Network and Distributed System Security Symposium (NDSS 2024), San Diego, CA, USA, February 26 - March 1

First Page

1

Last Page

19

ISBN

1-891562-93-2

Identifier

10.14722/ndss.2024.23208

Publisher

Internet Society

City or Country

San Diego, CA, USA

Citation

GAO, Lian; QU, Yu; YU, Sheng; DUAN, Yue; and YIN, Heng. SigmaDiff: Semantics-aware deep graph matching for pseudocode diffing. (2024). Proceedings of the 31st Network and Distributed System Security Symposium (NDSS 2024), San Diego, CA, USA, February 26 - March 1. 1-19.
Available at: https://ink.library.smu.edu.sg/sis_research/8668

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.14722/ndss.2024.23208

Included in

Information Security Commons, OS and Networks Commons

Research Collection School Of Computing and Information Systems

SigmaDiff: Semantics-aware deep graph matching for pseudocode diffing
