Beyond decision: Android malware description generation through profiling malicious behavior trajectory

Publication Type

Journal Article

Publication Date

8-2025

Abstract

Malware family labels and the key features that Android malware detection models use for decision-making are too coarse-grained to support a precise understanding of malicious behaviors. To address this limitation, we introduce the concept of the malicious behavior trajectory (MBT) and propose an approach called ProMal. ProMal automatically generates fine-grained malware descriptions for users from MBTs extracted from malware. Specifically, a labeled dataset of MBTs is constructed through substantial human effort and used to build a behavioral knowledge graph (BxKG). The BxKG is scalable and can be updated automatically using two strategies that keep it complete and timely: (1) accounting for the evolution of Android SDKs and (2) mining new MBTs from widely used malware datasets. The knowledge graph is essential to ProMal: because of its structured data representation and semantic relation modeling, it can infer new MBTs from existing ones and thus helps extract real MBTs from Android malware effectively. We evaluated ProMal on a recent malware dataset for which researcher-crafted malware descriptions are available. The Precision, Recall, and F1-Score of MBT identification based on BxKG reached 96.97%, 91.43%, and 0.94, respectively, outperforming state-of-the-art approaches. Taking the MBTs identified from Android malware as input, a large language model generates precise, fine-grained, and human-readable descriptions, whose readability and usability are verified through a user study. The generated descriptions play a significant role in interpreting and comprehending malware behaviors.
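
As a quick consistency check on the figures reported in the abstract, the F1-Score can be recomputed from Precision and Recall as their harmonic mean. The sketch below is illustrative only and is not part of ProMal; the function name `f1_score` is a hypothetical helper introduced here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        # Conventional value when both inputs are zero; avoids division by zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the abstract, converted from percentages.
reported_precision = 0.9697
reported_recall = 0.9143

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

Rounded to two decimal places, the result agrees with the reported F1-Score of 0.94.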

Discipline

Software Engineering

Research Areas

Intelligent Systems and Optimization

Publication

ACM Transactions on Software Engineering and Methodology

Volume

34

Issue

7

First Page

1

Last Page

39

ISSN

1049-331X

Identifier

10.1145/3715909

Publisher

Association for Computing Machinery (ACM)

Additional URL

https://doi.org/10.1145/3715909

