Dynamic linear discriminant analysis in high dimensional space

Binyan Jiang, Ziqi Chen, Chenlei Leng

Research output: Journal article publicationJournal articleAcademic researchpeer-review

1 Citation (Scopus)

Abstract

High-dimensional data that evolve dynamically feature predominantly in the modern data era. As a partial response to this, recent years have seen increasing emphasis to address the dimensionality challenge. However, the non-static nature of these datasets is largely ignored. This paper addresses both challenges by proposing a novel yet simple dynamic linear programming discriminant (DLPD) rule for binary classification. Different from the usual static linear discriminant analysis, the new method is able to capture the changing distributions of the underlying populations by modeling their means and covariances as smooth functions of covariates of interest. Under an approximate sparse condition, we show that the conditional misclassification rate of the DLPD rule converges to the Bayes risk in probability uniformly over the range of the variables used for modeling the dynamics, when the dimensionality is allowed to grow exponentially with the sample size. The minimax lower bound of the estimation of the Bayes risk is also established, implying that the misclassification rate of our proposed rule is minimax-rate optimal. The promising performance of the DLPD rule is illustrated via extensive simulation studies and the analysis of a breast cancer dataset.

Original languageEnglish
Pages (from-to)1234-1268
Number of pages35
JournalBernoulli
Volume26
Issue number2
DOIs
Publication statusPublished - 31 Jan 2020

Keywords

  • Bayes rule
  • Discriminant analysis
  • Dynamic linear programming
  • High-dimensional data
  • Kernel estimation
  • Sparsity

ASJC Scopus subject areas

  • Statistics and Probability

Cite this