Abstract
The rapid increase in the number of Android malware poses great challenges to anti-malware systems, because the sheer number of malware samples overwhelms malware analysis systems. The classification of malware samples into families, such that the common features shared by malware samples in the same family can be exploited in malware detection and inspection, is a promising approach for accelerating malware analysis. Furthermore, the selection of representative malware samples in each family can drastically decrease the number of malware to be analyzed. However, the existing classification solutions are limited because of the following reasons. First, the legitimate part of the malware may misguide the classification algorithms because the majority of Android malware are constructed by inserting malicious components into popular apps. Second, the polymorphic variants of Android malware can evade detection by employing transformation attacks. In this paper, we propose a novel approach that constructs frequent subgraphs (fregraphs) to represent the common behaviors of malware samples that belong to the same family. Moreover, we propose and develop FalDroid, a novel system that automatically classifies Android malware and selects representative malware samples in accordance with fregraphs. We apply it to 8407 malware samples from 36 families. Experimental results show that FalDroid can correctly classify 94.2% of malware samples into their families using approximately 4.6 sec per app. FalDroid can also dramatically reduce the cost of malware investigation by selecting only 8.5% to 22% representative samples that exhibit the most common malicious behavior among all samples.
Original language | English |
---|---|
Pages (from-to) | 1890-1905 |
Number of pages | 16 |
Journal | IEEE Transactions on Information Forensics and Security |
Volume | 13 |
Issue number | 8 |
DOIs | |
Publication status | Published - Aug 2018 |
Keywords
- Android malware
- Familial classification
- Frequent subgraph
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Computer Networks and Communications