Abstract
© 2018 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. High-dimension, low-sample-size statistical analysis is important in a wide range of applications. In such situations, the support vector machine, a highly appealing discrimination method, can be improved to alleviate data piling at the margin. This leads naturally to the development of distance weighted discrimination (DWD), which can be modeled as a second-order cone programming problem and solved by interior-point methods when the scale of the data (in sample size and feature dimension) is moderate. Here, we design a scalable and robust algorithm for solving large-scale generalized DWD problems. Numerical experiments on real datasets from the UCI repository demonstrate that our algorithm is highly efficient in solving large-scale problems, and sometimes even more efficient than the highly optimized LIBLINEAR and LIBSVM for solving the corresponding SVM problems. Supplementary material for this article is available online.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 368-379 |
Number of pages | 12 |
Journal | Journal of Computational and Graphical Statistics |
Volume | 27 |
Issue number | 2 |
DOIs | |
Publication status | Published - 3 Apr 2018 |
Keywords
- Convergent multi-block ADMM
- Data piling
- Support vector machine
ASJC Scopus subject areas
- Statistics and Probability
- Discrete Mathematics and Combinatorics
- Statistics, Probability and Uncertainty