Publication Type

Journal Article

Version

publishedVersion

Publication Date

9-2024

Abstract

With the emergence of smartphones, Android has become a widely used mobile operating system. However, it is vulnerable when encountering various types of attacks. Every day, new malware threatens the security of users' devices and private data. Many methods have been proposed to classify malicious applications, utilizing static or dynamic analysis for classification. However, previous methods still suffer from unsatisfactory performance due to two challenges. First, they are unable to address the imbalanced data distribution problem, leading to poor performance for malware families with few members. Second, they are unable to address the zero-day malware (zero-day malware refers to malicious applications that exploit unknown vulnerabilities) classification problem. In this article, we introduce an innovative meta-learning approach for multi-family Android malware classification named Meta-MAMC, which uses meta-learning technology to learn meta-knowledge (i.e., the similarities and differences among different malware families) of few-family samples and combines new sampling algorithms to solve the above challenges. Meta-MAMC integrates (i) the meta-knowledge contained within the dataset to guide models in learning to identify unknown malware; and (ii) more accurate and diverse tasks based on novel sampling strategies, as well as directly adapting metalearning to a new few-sample and zero-sample task to classify families. We have evaluated Meta-MAMC on two popular datasets and a corpus of real-world Android applications. The results demonstrate its efficacy in accurately classifying malicious applications belonging to certain malware families, even achieving 100% classification in some families.

Keywords

Android, malware family, meta-learning, classification

Discipline

Information Security | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ACM Transactions on Software Engineering and Methodology

Volume

33

Issue

7

First Page

1

Last Page

27

ISSN

1049-331X

Identifier

10.1145/3664806

Publisher

Association for Computing Machinery (ACM)

Copyright Owner and License

Author-CC-BY

Additional URL

https://doi.org/10.1145/3664806

Share

COinS