TY - JOUR
T1 - An Evolutionary Study of IoT Malware
AU - Wang, Huanran
AU - Zhang, Weizhe
AU - He, Hui
AU - Liu, Peng
AU - Luo, Daniel Xiapu
AU - Liu, Yang
AU - Jiang, Jiawei
AU - Li, Yan
AU - Zhang, Xing
AU - Liu, Wenmao
AU - Zhang, Runzi
AU - Lan, Xing
N1 - Funding Information:
This work was supported in part by the Key-Area Research and Development Program for Guangdong Province under Grant 2019B010136001; in part by the National Key Research and Development Plan under Grant 2017YFB0801801; in part by the National Natural Science Foundation of China (NSFC) under Grant 61672186 and Grant 61872110; and in part by HK RGC Project under Grant PolyU 152239/18E.
Publisher Copyright:
© 2014 IEEE.
PY - 2021/10/15
Y1 - 2021/10/15
N2 - Recent years have witnessed many attacks targeting widespread Internet of Things (IoT) devices, as well as malicious activities conducted by compromised IoT devices. After the source code of some notorious IoT malware was released, many new variants emerged, which are usually more powerful and stealthy. Although numerous existing studies have analyzed some exposed families, there is a lack of systematic studies that make full use of them, which can be a fundamental step for provenance, triage, labeling, lineage analysis, and authorship attribution. The key challenge in conducting an IoT malware evolutionary study is how to collect sufficient and accurate information about malware and identify the relationships among them. In this article, we take the first step toward investigating IoT malware evolution by leveraging information from two sources that complement each other. First, we crawl online articles about IoT malware and employ natural language processing techniques to extract the features of malware samples and their relationships with other malware families, which allows us to form the basic lineage graph. Second, we collect real malware samples through our widely deployed honeypots and design a new classifier to group them into families and identify lineage relationships among them. These results are used to enhance the basic lineage graph. Finally, we construct the final lineage graph for 72 IoT malware families by correlating the information from the aforementioned sources, which can help the research community better understand and fight IoT malware now and in the future. Our study has been incorporated into the threat awareness system of NSFOCUS.
AB - Recent years have witnessed many attacks targeting widespread Internet of Things (IoT) devices, as well as malicious activities conducted by compromised IoT devices. After the source code of some notorious IoT malware was released, many new variants emerged, which are usually more powerful and stealthy. Although numerous existing studies have analyzed some exposed families, there is a lack of systematic studies that make full use of them, which can be a fundamental step for provenance, triage, labeling, lineage analysis, and authorship attribution. The key challenge in conducting an IoT malware evolutionary study is how to collect sufficient and accurate information about malware and identify the relationships among them. In this article, we take the first step toward investigating IoT malware evolution by leveraging information from two sources that complement each other. First, we crawl online articles about IoT malware and employ natural language processing techniques to extract the features of malware samples and their relationships with other malware families, which allows us to form the basic lineage graph. Second, we collect real malware samples through our widely deployed honeypots and design a new classifier to group them into families and identify lineage relationships among them. These results are used to enhance the basic lineage graph. Finally, we construct the final lineage graph for 72 IoT malware families by correlating the information from the aforementioned sources, which can help the research community better understand and fight IoT malware now and in the future. Our study has been incorporated into the threat awareness system of NSFOCUS.
KW - Ensemble learning
KW - evolutionary study
KW - lineage inferring
KW - malware detection
KW - secure Internet of Things (IoT)
UR - http://www.scopus.com/inward/record.url?scp=85102285813&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2021.3063840
DO - 10.1109/JIOT.2021.3063840
M3 - Journal article
AN - SCOPUS:85102285813
SN - 2327-4662
VL - 8
SP - 15422
EP - 15440
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 20
ER -