Enhancing email classification using data reduction and disagreement-based semi-supervised learning

Yuxin Meng, Wenjuan Li, Lam For Kwok

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

15 Citations (Scopus)

Abstract

Email classification, which aims to correctly categorize user emails and filter out spam, is an important topic in the literature. In this paper, we identify several challenges in this area and propose an effective email classification model based on both data reduction and disagreement-based semi-supervised learning. In particular, the main objective of the data reduction is to select an optimal set of email features and discard irrelevant data, while the objective of the disagreement-based approach is to improve spam detection accuracy by automatically exploiting unlabeled data. In the evaluation, we assess the performance of the proposed model using two public datasets and one private dataset. The experimental results demonstrate that the proposed model improves overall email classification performance by increasing detection accuracy and reducing false rates.
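The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes a simple document-frequency filter for data reduction and a co-training-style loop as one instance of disagreement-based semi-supervised learning. All names (`select_features`, `NaiveBayes`, `co_train`, `classify`), the toy emails, and the parameters `k`, `rounds`, and `threshold` are hypothetical.

```python
import math
from collections import Counter

def select_features(labeled, k):
    """Data reduction (assumed heuristic): keep the k tokens whose
    document frequency differs most between spam (1) and ham (0)."""
    df = {0: Counter(), 1: Counter()}
    for tokens, y in labeled:
        df[y].update(set(tokens))
    vocab = set(df[0]) | set(df[1])
    return sorted(vocab, key=lambda t: (-abs(df[1][t] - df[0][t]), t))[:k]

class NaiveBayes:
    """Bernoulli naive Bayes restricted to one feature view."""
    def __init__(self, features):
        self.features = list(features)

    def fit(self, data):
        self.n = Counter(y for _, y in data)
        self.df = {0: Counter(), 1: Counter()}
        for tokens, y in data:
            self.df[y].update(set(tokens) & set(self.features))
        return self

    def prob_spam(self, tokens):
        present, total = set(tokens), sum(self.n.values())
        logp = {}
        for y in (0, 1):
            lp = math.log((self.n[y] + 1) / (total + 2))  # smoothed prior
            for f in self.features:
                p = (self.df[y][f] + 1) / (self.n[y] + 2)  # Laplace smoothing
                lp += math.log(p if f in present else 1 - p)
            logp[y] = lp
        m = max(logp.values())
        e = {y: math.exp(v - m) for y, v in logp.items()}
        return e[1] / (e[0] + e[1])

def co_train(labeled, unlabeled, k=6, rounds=3, threshold=0.8):
    """Two learners on disjoint views; where one is confident and the
    other is not, the confident one labels the email for its peer."""
    feats = select_features(labeled, k)
    views = (feats[0::2], feats[1::2])
    train = [list(labeled), list(labeled)]
    pool = list(unlabeled)
    models = [NaiveBayes(v).fit(t) for v, t in zip(views, train)]
    for _ in range(rounds):
        remaining = []
        for tokens in pool:
            p = [m.prob_spam(tokens) for m in models]
            conf = [max(q, 1 - q) for q in p]
            teacher = 0 if conf[0] >= conf[1] else 1
            if conf[teacher] >= threshold and conf[1 - teacher] < threshold:
                train[1 - teacher].append((tokens, int(p[teacher] >= 0.5)))
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):  # nothing new was labeled
            break
        pool = remaining
        models = [NaiveBayes(v).fit(t) for v, t in zip(views, train)]
    return models

def classify(models, tokens):
    """Average the two learners' spam probabilities; 1 = spam, 0 = ham."""
    avg = sum(m.prob_spam(tokens) for m in models) / len(models)
    return int(avg >= 0.5)

# Toy data, invented purely for illustration.
labeled = [
    (["free", "prize", "click"], 1),
    (["free", "offer", "click"], 1),
    (["meeting", "report", "agenda"], 0),
    (["meeting", "project", "report"], 0),
]
unlabeled = [["prize", "offer", "free"], ["agenda", "project", "meeting"]]
models = co_train(labeled, unlabeled)
```

In this sketch each unlabeled email is pseudo-labeled only when exactly one learner is confident about it, so every added example carries information the receiving learner lacked. The feature filter runs once on the labeled data, before any pseudo-labels are introduced.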

Original language: English
Title of host publication: 2014 IEEE International Conference on Communications, ICC 2014
Publisher: IEEE Computer Society
Pages: 622-627
Number of pages: 6
ISBN (Print): 9781479920037
DOIs
Publication status: Published - 2014
Externally published: Yes
Event: 2014 1st IEEE International Conference on Communications, ICC 2014 - Sydney, NSW, Australia
Duration: 10 Jun 2014 to 14 Jun 2014

Publication series

Name: 2014 IEEE International Conference on Communications, ICC 2014

Conference

Conference: 2014 1st IEEE International Conference on Communications, ICC 2014
Country/Territory: Australia
City: Sydney, NSW
Period: 10/06/14 to 14/06/14

Keywords

  • Data Reduction
  • Disagreement-based Semi-Supervised Learning
  • Email Classification
  • Machine Learning

ASJC Scopus subject areas

  • Computer Networks and Communications
