Research Collection School Of Computing and Information Systems

AutoPruner: transformer-based call graph pruning

Cong Thanh LE, Singapore Management University
Hong Jin KANG, Singapore Management University
Truong Giang NGUYEN, Singapore Management University
Stefanus AGUS HARYONO, Singapore Management University
David LO, Singapore Management University
Xuan-Bach D. LE
Huynh Quyet THANG

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2022

Abstract

Constructing a static call graph requires trade-offs between soundness and precision. Program analysis techniques for constructing call graphs are unfortunately usually imprecise. To address this problem, researchers have recently proposed call graph pruning empowered by machine learning to post-process call graphs constructed by static analysis. A machine learning model is built to capture information from the call graph by extracting structural features for use in a random forest classifier. It then removes edges that are predicted to be false positives. Despite the improvements shown by machine learning models, they are still limited as they do not consider the source code semantics and thus often are not able to effectively distinguish true and false positives.
In this paper, we present a novel call graph pruning technique, AutoPruner, for eliminating false positives in call graphs via both statistical semantic and structural analysis. Given a call graph constructed by traditional static analysis tools, AutoPruner takes a Transformer-based approach to capture the semantic relationships between the caller and callee functions associated with each edge in the call graph. To do so, AutoPruner fine-tunes a model of code that was pre-trained on a large corpus to represent source code based on descriptions of its semantics. Next, the model is used to extract semantic features from the functions related to each edge in the call graph. AutoPruner uses these semantic features together with the structural features extracted from the call graph to classify each edge via a feed-forward neural network. Our empirical evaluation on a benchmark dataset of real-world programs shows that AutoPruner outperforms the state-of-the-art baselines, improving on F-measure by up to 13% in identifying false-positive edges in a static call graph.
Moreover, AutoPruner achieves improvements on two client analyses, including halving the false alarm rate on null pointer analysis and over 10% improvements on monomorphic call-site detection. Additionally, our ablation study and qualitative analysis show that the semantic features extracted by AutoPruner capture a remarkable amount of information for distinguishing between true and false positives.
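The pruning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Edge`, `FeedForwardClassifier`, and `prune` are hypothetical, the semantic vectors are hand-written stand-ins for features that a fine-tuned code model would produce, the structural values are placeholders, and the network weights are random rather than trained. The sketch only shows the general shape of the idea, which is to concatenate semantic and structural features for each edge, score the edge with a feed-forward network, and drop edges whose score falls below a threshold.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Edge:
    """One call graph edge with its two feature groups (hypothetical layout)."""
    caller: str
    callee: str
    semantic: List[float]    # stand-in for features from a fine-tuned code model
    structural: List[float]  # stand-in for features taken from the call graph


def linear(weights: List[List[float]], bias: List[float], v: List[float]) -> List[float]:
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, x) for x in v]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class FeedForwardClassifier:
    """Tiny one-hidden-layer network. Weights are random, not trained (assumption)."""

    def __init__(self, in_dim: int, hidden_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]]
        self.b2 = [0.0]

    def predict_proba(self, features: List[float]) -> float:
        """Return a score in (0, 1), read here as 'edge is a true call'."""
        hidden = relu(linear(self.w1, self.b1, features))
        return sigmoid(linear(self.w2, self.b2, hidden)[0])


def prune(edges: List[Edge], clf: FeedForwardClassifier, threshold: float = 0.5) -> List[Edge]:
    """Keep only edges whose score reaches the threshold."""
    kept = []
    for edge in edges:
        features = edge.semantic + edge.structural  # concatenate both feature groups
        if clf.predict_proba(features) >= threshold:
            kept.append(edge)
    return kept


# Illustrative data only; the method names and numbers are made up.
edges = [
    Edge("A.run()", "B.start()", [0.9, 0.1, 0.3], [2.0, 1.0]),
    Edge("A.run()", "C.toString()", [0.2, 0.8, 0.5], [40.0, 1.0]),
]
clf = FeedForwardClassifier(in_dim=5, hidden_dim=4)
for edge in prune(edges, clf):
    print(edge.caller, "->", edge.callee)
```

In a real setting the classifier would be trained on edges labelled as true or false positives, and the threshold would be tuned on held-out data to balance precision against recall.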

Keywords

Call graph pruning, Static analysis, Pretrained language model, Transformer

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore, 2022 November 14-18

First Page

520

Last Page

532

ISBN

9781450394130

Identifier

10.1145/3540250.3549175

Publisher

Association for Computing Machinery

City or Country

New York

Citation

LE, Cong Thanh; KANG, Hong Jin; NGUYEN, Truong Giang; AGUS HARYONO, Stefanus; LO, David; LE, Xuan-Bach D.; and THANG, Huynh Quyet. AutoPruner: transformer-based call graph pruning. (2022). Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Singapore, 2022 November 14-18. 520-532.
Available at: https://ink.library.smu.edu.sg/sis_research/7740

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1145/3540250.3549175

Included in

Software Engineering Commons
