An ensemble embedded feature selection method for multi-label clinical text classification

Yumeng Guo, Fu Lai Korris Chung, Guozheng Li

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

9 Citations (Scopus)


Clinical data record a patient's health status and are often multi-label. For example, a patient suffering from cough and fever should be associated with both disease labels in the clinical record. Redundant or irrelevant features in clinical data limit the performance of multi-label classification, so selecting effective features from the feature space is necessary. However, few methods have been proposed for multi-label feature selection in the past few years, and the existing ones adopt a simple, direct strategy that transforms the multi-label feature selection problem into several single-label problems and ignores correlations among labels. In this paper, a novel method named ensemble embedded feature selection (EEFS) is proposed to handle the multi-label clinical data learning problem more effectively and efficiently. EEFS does not explicitly identify the correlations among labels, but it exploits them through multi-label classifiers and evaluation measures. Furthermore, it reduces the accumulated errors in the data itself by employing an ensemble method. Experimental results on a clinical dataset show that our algorithm significantly outperforms other state-of-the-art algorithms.
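
The general idea of combining an ensemble with embedded feature scoring can be illustrated with a minimal sketch. This is not the authors' EEFS algorithm, whose details the abstract does not give. As a stand-in assumption, the sketch fits a ridge least-squares model on bootstrap resamples and scores each feature by the norm of its weights across all labels, whereas the paper relies on multi-label classifiers and evaluation measures. The function names, parameters, and synthetic data are all hypothetical.

```python
import numpy as np


def ensemble_embedded_scores(X, Y, n_rounds=20, ridge=1.0, seed=0):
    """Average per-feature importance over bootstrap resamples.

    X: (n_samples, n_features) feature matrix.
    Y: (n_samples, n_labels) binary label-indicator matrix.
    Assumption: ridge least squares stands in for a multi-label classifier.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    scores = np.zeros(n_features)
    for _ in range(n_rounds):
        # Bootstrap resample: draw n_samples row indices with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        Xb, Yb = X[idx], Y[idx]
        # Ridge solution W = (Xb^T Xb + ridge * I)^{-1} Xb^T Yb,
        # giving one weight column per label.
        gram = Xb.T @ Xb + ridge * np.eye(n_features)
        W = np.linalg.solve(gram, Xb.T @ Yb)
        # Embedded importance: L2 norm of each feature's weights over labels.
        scores += np.linalg.norm(W, axis=1)
    return scores / n_rounds


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, sorted ascending."""
    return np.sort(np.argsort(scores)[::-1][:k])


# Synthetic demo: 10 features, 2 labels; only features 0 and 3 are informative.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))
Y = np.column_stack([
    X[:, 0] > 0,              # label 0 depends on feature 0
    (X[:, 0] + X[:, 3]) > 0,  # label 1 depends on features 0 and 3
]).astype(float)

scores = ensemble_embedded_scores(X, Y)
selected = select_top_k(scores, k=2)
print(selected)
```

Averaging scores over resamples is meant to damp the effect of noisy individual samples, in the spirit of the error reduction the abstract attributes to the ensemble. Because the ridge model fits each label column independently, this sketch combines labels only at the scoring step and does not capture label correlations the way the paper describes.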
Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016
Number of pages: 4
ISBN (Electronic): 9781509016105
Publication status: Published - 17 Jan 2017
Event: 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016 - Shenzhen, China
Duration: 15 Dec 2016 to 18 Dec 2016


Conference: 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016


  • Clinical text classification
  • Ensemble embedded feature selection
  • Multi-label learning

ASJC Scopus subject areas

  • Genetics
  • Medicine (miscellaneous)
  • Genetics (clinical)
  • Biochemistry, medical
  • Biochemistry
  • Molecular Medicine
  • Health Informatics
