Research Collection School Of Computing and Information Systems

Experimental comparison of features, analyses, and classifiers for Android malware detection

Lwin Khin SHAR, Singapore Management University
Biniam Fisseha DEMISSIE
Mariano CECCATO
Naing Tun YAN, Singapore Management University
David LO, Singapore Management University
Lingxiao JIANG, Singapore Management University
Christoph BIENERT

Publication Type

Journal Article

Version

acceptedVersion

Publication Date

9-2023

Abstract

Android malware detection has been an active area of research. In the past decade, several machine learning-based approaches have been proposed, based on different types of features that may characterize Android malware behaviors. Commonly analyzed features include API usages and sequences at various abstraction levels (e.g., class and package), extracted using static or dynamic analysis. Features that characterize permission uses, native API calls, and reflection have also been analyzed. Initial works used conventional classifiers such as Random Forest to learn on those features. In recent years, deep learning-based classifiers such as Recurrent Neural Network have been explored. Given the various types of features, analyses, and classifiers proposed in the literature, there is a need for a comprehensive evaluation of the performance of current state-of-the-art Android malware classification on a common benchmark. In this study, we evaluate the performance of different types of features and compare the performance of a conventional classifier, Random Forest (RF), with that of a deep learning classifier, Recurrent Neural Network (RNN). To avoid temporal and spatial biases, we evaluate performance in a time- and space-aware setting in which classifiers are trained with older apps and tested on newer apps, and the distribution of test samples is representative of the in-the-wild malware-to-benign ratio. Features are extracted from a common benchmark of 7,860 benign samples and 5,912 malware samples, whose release years span from 2010 to 2020. Among other findings, our study shows that permission use features perform the best among the features we investigated; package-level features generally perform better than class-level features; static features generally perform better than dynamic features; and the RNN classifier performs better than the RF classifier when trained on sequence-type features.
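
The time- and space-aware setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `App` record, the function names, the split year, and the 10% malware ratio used in the usage note are all hypothetical choices made for illustration, since the abstract does not state them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str         # hypothetical sample identifier
    year: int         # release year of the app
    is_malware: bool  # ground-truth label


def time_aware_split(apps, split_year):
    """Train on apps released before split_year, test on the rest.

    This keeps every training sample older than every test sample,
    which is one simple way to avoid temporal bias.
    """
    train = [a for a in apps if a.year < split_year]
    test = [a for a in apps if a.year >= split_year]
    return train, test


def space_aware_test_set(test, malware_ratio, seed=0):
    """Downsample malware so it forms roughly malware_ratio of the test set.

    malware_ratio is an assumed parameter (0 < malware_ratio < 1); the
    realistic in-the-wild value must come from the study itself.
    """
    benign = [a for a in test if not a.is_malware]
    malware = [a for a in test if a.is_malware]
    # Solve m / (m + len(benign)) = malware_ratio for m.
    wanted = round(len(benign) * malware_ratio / (1.0 - malware_ratio))
    rng = random.Random(seed)  # fixed seed for reproducible sampling
    kept = rng.sample(malware, min(wanted, len(malware)))
    return benign + kept
```

For example, with a toy corpus of apps from 2010 to 2020, `time_aware_split(apps, 2016)` followed by `space_aware_test_set(test, 0.10)` yields a test set of newer apps only, in which about one sample in ten is malware. The training set is left untouched here; whether and how to rebalance it is a separate design decision.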

Keywords

malware detection, machine learning, deep learning, android

Discipline

Information Security | Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

Empirical Software Engineering

Volume

28

First Page

1

Last Page

39

ISSN

1382-3256

Identifier

10.1007/s10664-023-10375-y

Publisher

Springer

Citation

SHAR, Lwin Khin; DEMISSIE, Biniam Fisseha; CECCATO, Mariano; YAN, Naing Tun; LO, David; JIANG, Lingxiao; and BIENERT, Christoph. Experimental comparison of features, analyses, and classifiers for Android malware detection. (2023). Empirical Software Engineering. 28, 1-39.
Available at: https://ink.library.smu.edu.sg/sis_research/8211

Copyright Owner and License

Authors

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1007/s10664-023-10375-y

Included in

Information Security Commons, Software Engineering Commons
