Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
4-2022
Abstract
Function signature recovery is important for binary analysis and security enhancement, such as bug finding and control-flow integrity enforcement. However, binary executables typically have information vital for function signature recovery stripped off during compilation. To make things worse, recent studies show that many compiler optimization strategies further complicate the recovery of function signatures by intentionally violating function calling conventions. In this paper, we first perform a systematic study to quantify the extent to which compiler optimizations (negatively) impact the accuracy of existing deep learning techniques for function signature recovery. Our experiments show that the accuracy of a state-of-the-art deep learning technique drops from 98.7% to 87.7% when it is trained and tested on optimized binaries. We further identify specific weaknesses in existing approaches and propose an enhanced deep learning approach named ReSIL (Revivifying Function Signature Inference using Deep Learning) to incorporate compiler-optimization-specific domain knowledge into the learning process. Our experimental results show that ReSIL significantly improves the accuracy and F1 score in inferring function signatures, e.g., raising the accuracy of inferring the number of arguments for callees compiled with optimization flag O1 from 84.8% to 92.67%. We also demonstrate the security implications of ReSIL in Control-Flow Integrity enforcement, where it helps stop potential Counterfeit Object-Oriented Programming (COOP) attacks.
Keywords
Function Signature, Recurrent Neural Network, Compiler Optimization
Discipline
Information Security | OS and Networks
Research Areas
Information Systems and Management
Publication
Proceedings of the 12th ACM Conference on Data and Application Security and Privacy, Baltimore, USA, 2022 April 24-27
First Page
107
Last Page
118
ISBN
9781450392204
Identifier
10.1145/3508398.3511502
Publisher
ACM
City or Country
Baltimore, USA
Citation
LIN, Yan; GAO, Debin; and LO, David.
ReSIL: Revivifying function signature inference using deep learning with domain-specific knowledge. (2022). Proceedings of the 12th ACM Conference on Data and Application Security and Privacy, Baltimore, USA, 2022 April 24-27. 107-118.
Available at: https://ink.library.smu.edu.sg/sis_research/7355
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3508398.3511502