Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

5-2023

Abstract

Smart contracts in blockchains have been increasingly used for high-value business applications. It is essential to check smart contracts' reliability before and after deployment. Although various program analysis and deep learning techniques have been proposed to detect vulnerabilities in either Ethereum smart contract source code or bytecode, their detection accuracy and scalability are still limited. This paper presents a novel framework named MANDO-HGT for detecting smart contract vulnerabilities. Given Ethereum smart contracts in either source code or bytecode form, whether vulnerable or clean, MANDO-HGT custom-builds heterogeneous contract graphs (HCGs) to represent control-flow and/or function-call information of the code. It then adapts heterogeneous graph transformers (HGTs) with customized meta relations for graph nodes and edges to learn their embeddings and train classifiers that detect various vulnerability types in the nodes and graphs of the contracts more accurately. We have collected more than 55K Ethereum smart contracts from various data sources and verified the labels for 423 buggy and 2,742 clean contracts to evaluate MANDO-HGT. Our empirical results show that MANDO-HGT significantly improves on the detection accuracy of other state-of-the-art vulnerability detection techniques, whether they are based on machine learning or on conventional analysis. The accuracy improvements in terms of F1-score range from 0.7% to more than 76% at either the coarse-grained contract level or the fine-grained line level for various vulnerability types in either source code or bytecode. Our method is general and can be retrained easily for different vulnerability types without the need for manually defined vulnerability patterns.

Keywords

Bytecode, Graph transformer, Heterogeneous graph learning, Smart contracts, Source code, Vulnerability detection

Discipline

Databases and Information Systems | Graphics and Human Computer Interfaces

Research Areas

Software and Cyber-Physical Systems

Publication

Proceedings of the 20th IEEE/ACM International Conference on Mining Software Repositories, Melbourne, Australia, 2023 May 15-16

First Page

334

Last Page

346

Identifier

10.1109/MSR59073.2023.00052

Publisher

IEEE

City or Country

New York

Additional URL

https://doi.org/10.1109/MSR59073.2023.00052
