Stable prediction in high-dimensional linear models

Bingqing Lin, Qihua Wang, Jun Zhang, Zhen Pang

Research output: Journal article publication, Journal article, Academic research, peer-review

8 Citations (Scopus)

Abstract

We propose a Random Splitting Model Averaging procedure, RSMA, to achieve stable predictions in high-dimensional linear models. The idea is to use split training data to construct and estimate candidate models and to use test data to form second-level data. The second-level data are used to estimate optimal weights for the candidate models by quadratic optimization under non-negativity constraints. This procedure has three appealing features: (1) RSMA avoids model overfitting and, as a result, gives improved prediction accuracy. (2) By adaptively choosing optimal weights, we obtain more stable predictions via averaging over several candidate models. (3) Based on RSMA, a weighted importance index is proposed to rank the predictors and to discriminate relevant predictors from irrelevant ones. Simulation studies and a real data analysis demonstrate that the RSMA procedure has excellent predictive performance and that the associated weighted importance index ranks the predictors well.
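
The pipeline outlined in the abstract (random splits, candidate models fitted on the training part, held-out predictions stacked as second-level data, non-negative weights from a quadratic program) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: candidates are built by marginal-correlation screening with hypothetical sizes `sizes`, the quadratic program is solved by plain projected gradient descent, candidate predictions are averaged over splits, and the importance index is a simple weight-mass summary. The names `rsma_sketch`, `design` and `nonneg_least_squares` are invented here.

```python
import numpy as np


def design(X, cols):
    """Design matrix with an intercept column plus the selected predictors."""
    return np.column_stack([np.ones(X.shape[0]), X[:, cols]])


def nonneg_least_squares(Z, y, n_iter=2000):
    """Minimise 0.5 * ||y - Z w||^2 subject to w >= 0 (projected gradient)."""
    w = np.full(Z.shape[1], 1.0 / Z.shape[1])
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    for _ in range(n_iter):
        w = np.maximum(w - step * (Z.T @ (Z @ w - y)), 0.0)
    return w


def rsma_sketch(X, y, sizes=(1, 2, 5, 10), n_splits=10, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    fits, Z_blocks, y_blocks = [], [], []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, test = perm[: n // 2], perm[n // 2:]
        # Assumed candidate construction: rank predictors by absolute
        # marginal correlation with y on the training half, keep the top k.
        Xc = X[train] - X[train].mean(axis=0)
        yc = y[train] - y[train].mean()
        score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
        order = np.argsort(-score)
        split_fits = []
        for k in sizes:
            cols = order[:k]
            beta, *_ = np.linalg.lstsq(design(X[train], cols), y[train], rcond=None)
            split_fits.append((cols, beta))
        fits.append(split_fits)
        # Second-level data: held-out predictions, one column per candidate.
        Z_blocks.append(np.column_stack([design(X[test], c) @ b for c, b in split_fits]))
        y_blocks.append(y[test])
    weights = nonneg_least_squares(np.vstack(Z_blocks), np.concatenate(y_blocks))

    def predict(X_new):
        # Average each candidate over splits, then combine with the weights.
        per_split = [np.column_stack([design(X_new, c) @ b for c, b in sf]) for sf in fits]
        return np.mean(per_split, axis=0) @ weights

    # Hypothetical importance index: weight mass of the candidates that
    # contain each predictor, averaged over splits.
    importance = np.zeros(p)
    for sf in fits:
        for w, (cols, _) in zip(weights, sf):
            importance[cols] += w / n_splits
    return weights, predict, importance
```

Calling `rsma_sketch(X, y)` on an `n x p` array `X` and a length-`n` response `y` returns the non-negative weights, a `predict` function for new rows, and the per-predictor importance scores. Because the weights are estimated on held-out predictions rather than in-sample fits, candidates that overfit the training half tend to receive little weight, which is the intuition behind the stability claim.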

Original language: English
Pages (from-to): 1401-1412
Number of pages: 12
Journal: Statistics and Computing
Volume: 27
Issue number: 5
Publication status: Published - 1 Sep 2017

Keywords

  • Model averaging
  • Penalized regression
  • Screening
  • Variable selection

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Computational Theory and Mathematics
