Publication Type
Journal Article
Version
acceptedVersion
Publication Date
9-2025
Abstract
With the rapid development of blockchain technology, the widespread adoption of smart contracts—particularly in decentralized finance (DeFi) applications—has introduced significant security challenges, such as reentrancy attacks, phishing, and Sybil attacks. To address these issues, we propose a novel model called TrxGNNBERT, which combines a Graph Neural Network (GNN) with a Transformer architecture to effectively handle both graph-structured and textual data. This combination enhances the detection of suspicious transactions and accounts on blockchain platforms such as Ethereum. TrxGNNBERT was pre-trained with a masked language modeling (MLM) objective on a dataset of 60,000 Ethereum transactions by randomly masking the attributes of nodes and edges, thereby capturing deep semantic relationships and structural information. In this work, we constructed transaction subgraphs and used a GNN module to enrich the embedding representations, which were then fed into a Transformer encoder. The experimental results demonstrate that TrxGNNBERT outperforms various baseline models—including DeepWalk, Trans2Vec, Role2Vec, GCN, GAT, GraphSAGE, CodeBERT, GraphCodeBERT, Zipzap and BERT4ETH—in detecting suspicious transactions and accounts. Specifically, TrxGNNBERT achieved an accuracy of 0.755 and an F1 score of 0.756 on the TrxLarge dataset; an accuracy of 0.903 and an F1 score of 0.894 on the TrxSmall dataset; and an accuracy of 0.790 and an F1 score of 0.781 on the AddrDec dataset. We also explored different pre-training configurations and strategies, comparing the performance of encoder-based versus decoder-based Transformer structures. The results indicate that pre-training improves downstream task performance, with encoder-based structures outperforming decoder-based ones. Through ablation studies, we found that node-level information and subgraph structures are critical for achieving optimal performance in transaction classification tasks.
When key features were removed, the model's performance declined considerably, demonstrating the importance of each component of our method. These findings offer valuable insights for future research, suggesting further improvements in node attribute representation and subgraph extraction.
Keywords
Smart contract transaction, pre-trained model, blockchain security
Discipline
Artificial Intelligence and Robotics | Programming Languages and Compilers
Areas of Excellence
Digital transformation
Publication
IEEE Transactions on Information Forensics and Security
Volume
20
First Page
10051
Last Page
10065
ISSN
1556-6013
Identifier
10.1109/TIFS.2025.3612184
Publisher
Institute of Electrical and Electronics Engineers
Citation
MA, Wei; SHI, Junjie; QIU, Jiaxi; WU, Cong; CHEN, Jing; JIANG, Lingxiao; LIU, Shangqing; LIU, Yang; and XIANG, Yang.
Detecting DeFi fraud with a graph-transformer language model. (2025). IEEE Transactions on Information Forensics and Security. 20, 10051-10065.
Available at: https://ink.library.smu.edu.sg/sis_research/10578
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1109/TIFS.2025.3612184
Included in
Artificial Intelligence and Robotics Commons, Programming Languages and Compilers Commons