TY - JOUR
T1 - MalRadar: Demystifying Android Malware in the New Era
AU - Wang, Liu
AU - Wang, Haoyu
AU - He, Ren
AU - Tao, Ran
AU - Meng, Guozhu
AU - Luo, Xiapu
AU - Liu, Xuanzhe
N1 - Funding Information:
We sincerely thank our shepherd Prof. Niklas Carlsson (Linkoping University) and all the anonymous reviewers for their valuable suggestions and comments to improve this paper. This work was supported by the National Natural Science Foundation of China (grants No.62072046, 61873069, 62102042, 61902395) and Hong Kong RGC Projects (PolyU15223918, PolyU15224121). Xuanzhe’s work was supported by Alibaba Group through Alibaba Innovative Research (AIR) Program.
Publisher Copyright:
© 2022 ACM.
PY - 2022/6/2
Y1 - 2022/6/2
N2 - Mobile malware detection has attracted massive research effort in our community. A reliable and up-to-date malware dataset is critical to evaluate the effectiveness of malware detection approaches. Essentially, the malware ground truth should be manually verified by security experts, and their malicious behaviors should be carefully labelled. Although there are several widely-used malware benchmarks in our community (e.g., MalGenome, Drebin, Piggybacking and AMD, etc.), these benchmarks face several limitations including out-of-date, size, coverage, and reliability issues, etc. In this paper, we first make efforts to create MalRadar, a growing and up-to-date Android malware dataset using the most reliable way, i.e., by collecting malware based on the analysis reports of security experts. We have crawled all the mobile security related reports released by ten leading security companies, and used an automated approach to extract and label the useful ones describing new Android malware and containing Indicators of Compromise (IoC) information. We have successfully compiled MalRadar, a dataset that contains 4,534 unique Android malware samples (including both apks and metadata) released from 2014 to April 2021 by the time of this paper, all of which were manually verified by security experts with detailed behavior analysis. Then we characterize the MalRadar dataset from malware distribution channels, app installation methods, malware activation, malicious behaviors and anti-analysis techniques. We further investigate the malware evolution over the last decade. At last, we measure the effectiveness of commercial anti-virus engines and malware detection techniques on detecting malware in MalRadar. Our dataset can be served as the representative Android malware benchmark in the new era, and our observations can positively contribute to the community and boost a series of research studies on mobile security.
AB - Mobile malware detection has attracted massive research effort in our community. A reliable and up-to-date malware dataset is critical to evaluate the effectiveness of malware detection approaches. Essentially, the malware ground truth should be manually verified by security experts, and their malicious behaviors should be carefully labelled. Although there are several widely-used malware benchmarks in our community (e.g., MalGenome, Drebin, Piggybacking and AMD, etc.), these benchmarks face several limitations including out-of-date, size, coverage, and reliability issues, etc. In this paper, we first make efforts to create MalRadar, a growing and up-to-date Android malware dataset using the most reliable way, i.e., by collecting malware based on the analysis reports of security experts. We have crawled all the mobile security related reports released by ten leading security companies, and used an automated approach to extract and label the useful ones describing new Android malware and containing Indicators of Compromise (IoC) information. We have successfully compiled MalRadar, a dataset that contains 4,534 unique Android malware samples (including both apks and metadata) released from 2014 to April 2021 by the time of this paper, all of which were manually verified by security experts with detailed behavior analysis. Then we characterize the MalRadar dataset from malware distribution channels, app installation methods, malware activation, malicious behaviors and anti-analysis techniques. We further investigate the malware evolution over the last decade. At last, we measure the effectiveness of commercial anti-virus engines and malware detection techniques on detecting malware in MalRadar. Our dataset can be served as the representative Android malware benchmark in the new era, and our observations can positively contribute to the community and boost a series of research studies on mobile security.
KW - Android malware
KW - Dataset
KW - Malware evolution
KW - Security reports
UR - http://www.scopus.com/inward/record.url?scp=85131700538&partnerID=8YFLogxK
U2 - 10.1145/3530906
DO - 10.1145/3530906
M3 - Journal article
SN - 2476-1249
VL - 6
SP - 1
EP - 27
JO - Proceedings of the ACM on Measurement and Analysis of Computing Systems
JF - Proceedings of the ACM on Measurement and Analysis of Computing Systems
IS - 2
M1 - 3530906
ER -