DeepRefiner: Multi-layer Android malware detection system applying deep neural networks
Abstract
Because malicious behaviors vary significantly across mobile malware, detecting malware both efficiently and effectively is challenging. Moreover, because malicious behaviors evolve continuously, laborious manual feature engineering struggles to keep pace with malware evolution. To address these challenges, we propose DeepRefiner, a multi-layer system that identifies malware both efficiently and effectively. Its effectiveness comes from semantic-based deep learning: we apply Long Short-Term Memory (LSTM) to the semantic structure of Android bytecode, which preserves the details of method-level bytecode semantics. Its efficiency comes from applying a Multilayer Perceptron (MLP) to XML files, based on the finding that most malware can be identified efficiently using information from XML files alone. We evaluate the detection performance of DeepRefiner on 62,915 malicious applications and 47,525 benign applications, showing that it detects malware with an accuracy of 97.74% and a false positive rate of 2.54%. We compare DeepRefiner with StormDroid, a state-of-the-art detection system based on a single classifier, and with ten widely used signature-based anti-virus scanners. The experimental results show that DeepRefiner significantly outperforms both StormDroid and the anti-virus scanners. In addition, we evaluate the robustness of DeepRefiner against typical obfuscation techniques and adversarial samples. The experimental results demonstrate that DeepRefiner is robust in detecting obfuscated malicious applications.
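
The two-layer idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how samples move from the XML-based layer to the bytecode-based layer, so the confidence thresholds `low` and `high`, the function `cascade_predict`, and the toy scorers are all hypothetical. The sketch assumes each layer outputs an estimated probability that an application is malicious, and that only samples the cheap first layer is unsure about reach the costlier second layer.

```python
from typing import Callable, Dict, Tuple

Sample = Dict[str, float]
Scorer = Callable[[Sample], float]  # returns an estimated P(malicious) in [0, 1]


def cascade_predict(
    sample: Sample,
    first_layer: Scorer,
    second_layer: Scorer,
    low: float = 0.2,   # hypothetical threshold: at or below this, call it benign
    high: float = 0.8,  # hypothetical threshold: at or above this, call it malicious
) -> Tuple[str, int]:
    """Return (label, layer_used) for one application."""
    p = first_layer(sample)  # cheap model, e.g. an MLP over XML-derived features
    if p >= high:
        return ("malicious", 1)
    if p <= low:
        return ("benign", 1)
    # The first layer is uncertain, so fall back to the expensive model,
    # e.g. an LSTM over method-level bytecode sequences.
    q = second_layer(sample)
    return ("malicious" if q >= 0.5 else "benign", 2)


# Toy stand-ins for trained models (assumptions, for illustration only).
def toy_xml_scorer(sample: Sample) -> float:
    return sample["xml_score"]


def toy_bytecode_scorer(sample: Sample) -> float:
    return sample["bytecode_score"]


if __name__ == "__main__":
    confident = {"xml_score": 0.95, "bytecode_score": 0.10}
    uncertain = {"xml_score": 0.50, "bytecode_score": 0.90}
    print(cascade_predict(confident, toy_xml_scorer, toy_bytecode_scorer))
    print(cascade_predict(uncertain, toy_xml_scorer, toy_bytecode_scorer))
```

In this sketch the second layer runs only when the first layer's score falls strictly between the two thresholds, which is one plausible way to spend the expensive bytecode analysis on the harder cases.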