Consistent Screening Procedures in High-dimensional Binary Classification

Hangjin Jiang, Xingqiu Zhao, Ronald C.W. Ma, Xiaodan Fan

Research output: Journal article publication › Journal article › Academic research › peer-review

4 Citations (Scopus)

Abstract

We consider variable screening in high-dimensional binary classification. First, we propose nonparametric test statistics for the problem of the two-sample distribution comparison. These test statistics combine the merits of the chi-squared and Kolmogorov–Smirnov statistics, and provide new insights into the equality test of the unspecified distributions underlying the two independent samples. Based on our new statistics, we propose a marginal screening procedure and a pairwise joint screening procedure for detecting important variables in high-dimensional binary classification. Both screening procedures have the consistent screening property, which is stronger than the sure screening property of most existing methods. The marginal screening procedure is much more powerful than other methods over a broad range of cases, and the pairwise joint screening procedure provides a way of detecting variables with a joint effect, but no marginal effect. Extensive simulations and a real-data application show the effectiveness and advantages of the proposed methods.
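The general idea of marginal screening with a two-sample statistic can be illustrated with a minimal sketch. It does not implement the statistics proposed in the paper, which the abstract does not define; it substitutes the classical two-sample Kolmogorov–Smirnov statistic as a stand-in. The function names `ks_statistic` and `marginal_screen`, the top-`keep` selection rule, and the toy data are all assumptions made for illustration.

```python
import random
from bisect import bisect_right


def ks_statistic(x, y):
    """Classical two-sample KS statistic: max over t of |F_x(t) - F_y(t)|."""
    xs, ys = sorted(x), sorted(y)
    # The maximum gap between two empirical CDFs occurs at a pooled sample point.
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )


def marginal_screen(rows, labels, keep):
    """Score each feature by the KS distance between the two classes.

    Returns the indices of the `keep` highest-scoring features and all scores.
    Keeping a fixed number of features is an illustrative choice, not the
    selection rule used in the paper.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        class0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        class1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        scores.append(ks_statistic(class0, class1))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep], scores


# Toy data (hypothetical): 200 samples, 50 noise features, and a mean shift
# between the two classes in feature 3 only.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
rows = []
for y in labels:
    row = [rng.gauss(0, 1) for _ in range(50)]
    row[3] += 3.0 * y
    rows.append(row)

selected, scores = marginal_screen(rows, labels, keep=1)
```

A purely marginal score of this kind cannot detect a variable whose class-conditional distributions coincide marginally but differ jointly with another variable, which is the case the pairwise joint screening procedure is meant to address.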

Original language: English
Pages (from-to): 109-130
Number of pages: 22
Journal: Statistica Sinica
Volume: 32
Issue number: 1
Publication status: Published - Jan 2022

Keywords

  • Binary classification
  • Consistency
  • Non-parametric test
  • Two-sample distribution comparison
  • Variable screening

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
