TY - JOUR
T1 - Design of multi-view based email classification for IoT systems via semi-supervised learning
AU - Li, Wenjuan
AU - Meng, Weizhi
AU - Tan, Zhiyuan
AU - Xiang, Yang
N1 - Funding Information:
Dr. Meng was partially supported by H2020 SU-ICT-03- 2018 CyberSec4Europe .
Funding Information:
Yang Xiang received his PhD in Computer Science from Deakin University, Australia. He is the Dean of Digital Research & Innovation Capability Platform, Swinburne University of Technology, Australia. His research interests include cyber security, which covers network and system security, data analytics, distributed systems, and networking. In particular, he is currently leading his team developing active defense systems against large-scale distributed network attacks. He is the Chief Investigator of several projects in network and system security, funded by the Australian Research Council (ARC). He has published more than 200 research papers in many international journals and conferences, such as IEEE Transactions on Computers, IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Information Security and Forensics, and IEEE Journal on Selected Areas in Communications. He served as the Associate Editor of IEEE Transactions on Computers, IEEE Transactions on Parallel and Distributed Systems, Security and Communication Networks (Wiley), and the Editor of Journal of Network and Computer Applications. He is the Coordinator, Asia for IEEE Computer Society Technical Committee on Distributed Processing (TCDP). He is a Senior Member of the IEEE.
Publisher Copyright:
© 2018 Elsevier Ltd
PY - 2019/2/15
Y1 - 2019/2/15
N2 - Suspicious emails are one big threat for Internet of Things (IoT) security, which aim to induce users to click and then redirect them to a phishing webpage. To protect IoT systems, email classification is an essential mechanism to classify spam and legitimate emails. In the literature, most email classification approaches adopt supervised learning algorithms that require a large number of labeled data for classifier training. However, data labeling is very time consuming and expensive, making only a very small set of data available in practice, which would greatly degrade the effectiveness of email classification. To mitigate this problem, in this work, we develop an email classification approach based on multi-view disagreement-based semi-supervised learning. The idea behind is that multi-view method can offer richer information for classification, which is often ignored by the literature. The use of semi-supervised learning can help leverage both labeled and unlabeled data. In the evaluation, we investigate the performance of our proposed approach with two datasets and in a real network environment. Experimental results demonstrate that the use of multi-view data can achieve more accurate email classification than the use of single-view data, and that our approach is more effective as compared to several existing similar algorithms.
AB - Suspicious emails are one big threat for Internet of Things (IoT) security, which aim to induce users to click and then redirect them to a phishing webpage. To protect IoT systems, email classification is an essential mechanism to classify spam and legitimate emails. In the literature, most email classification approaches adopt supervised learning algorithms that require a large number of labeled data for classifier training. However, data labeling is very time consuming and expensive, making only a very small set of data available in practice, which would greatly degrade the effectiveness of email classification. To mitigate this problem, in this work, we develop an email classification approach based on multi-view disagreement-based semi-supervised learning. The idea behind is that multi-view method can offer richer information for classification, which is often ignored by the literature. The use of semi-supervised learning can help leverage both labeled and unlabeled data. In the evaluation, we investigate the performance of our proposed approach with two datasets and in a real network environment. Experimental results demonstrate that the use of multi-view data can achieve more accurate email classification than the use of single-view data, and that our approach is more effective as compared to several existing similar algorithms.
KW - Disagreement-based learning
KW - Email classification
KW - IoT security
KW - Multi-view data
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85059001152&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2018.12.002
DO - 10.1016/j.jnca.2018.12.002
M3 - Journal article
AN - SCOPUS:85059001152
SN - 1084-8045
VL - 128
SP - 56
EP - 63
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
ER -