Abstract
This paper studies the dimension effect of the linear discriminant analysis (LDA) and regularized linear discriminant analysis (RLDA) classifiers for large-dimensional data, where the observation dimension p is of the same order as the sample size n. Building on properties of the Wishart distribution and recent results in random matrix theory, we derive explicit expressions for the asymptotic misclassification errors of LDA and RLDA, which give insight into how, and in what sense, dimension affects classification performance. Motivated by these results, we propose adjusted classifiers that correct the bias caused by unequal sample sizes. The bias-corrected LDA and RLDA classifiers are shown to have smaller misclassification rates than LDA and RLDA, respectively. Several examples are discussed in detail, and the theoretical results on the dimension effect are illustrated through extensive simulation studies.
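For readers unfamiliar with the two classifiers, the following is a minimal sketch of the textbook two-class LDA rule and a ridge-type RLDA variant, evaluated on simulated Gaussian data with unequal sample sizes and p comparable to n. It is not taken from the paper: the function names, the regularization form `S + lam * I`, and the simulation settings are all assumptions made for illustration, and the paper's bias-corrected classifiers are not implemented here.

```python
import numpy as np

def fit_discriminant(X1, X2, lam=0.0):
    """Return (w, b) for the rule: assign x to class 1 if w @ x + b > 0.

    lam = 0 gives the textbook LDA rule; lam > 0 adds a ridge term to the
    pooled covariance (one common form of RLDA, assumed here).
    """
    n1, n2 = len(X1), len(X2)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled sample covariance of the two classes.
    S = ((X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)) / (n1 + n2 - 2)
    p = S.shape[0]
    w = np.linalg.solve(S + lam * np.eye(p), m1 - m2)
    b = -0.5 * w @ (m1 + m2)
    return w, b

def error_rate(w, b, T1, T2):
    """Average of the two class-wise test misclassification rates."""
    e1 = np.mean(T1 @ w + b <= 0)  # class-1 points assigned to class 2
    e2 = np.mean(T2 @ w + b > 0)   # class-2 points assigned to class 1
    return 0.5 * (e1 + e2)

# Hypothetical simulation settings: p is of the same order as n1 + n2,
# and the two training samples have unequal sizes.
rng = np.random.default_rng(0)
p, n1, n2 = 50, 60, 120
mu = np.zeros(p)
mu[:5] = 1.0
X1 = rng.normal(size=(n1, p)) + mu
X2 = rng.normal(size=(n2, p))
T1 = rng.normal(size=(2000, p)) + mu
T2 = rng.normal(size=(2000, p))

for lam in (0.0, 1.0):
    w, b = fit_discriminant(X1, X2, lam)
    print(f"lam={lam}: test error = {error_rate(w, b, T1, T2):.3f}")
```

Varying `p`, `n1`, and `n2` in this sketch gives a rough empirical feel for how the test error moves with dimension and sample-size imbalance, which is the kind of behaviour the asymptotic results describe.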
Original language | English |
---|---|
Pages (from-to) | 2709-2742 |
Number of pages | 34 |
Journal | Electronic Journal of Statistics |
Volume | 12 |
Issue number | 2 |
Publication status | Published - Jan 2018 |
Keywords
- Dimension effect
- Linear discriminant analysis
- Random matrix theory
- Regularized linear discriminant analysis
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty