TY - JOUR
T1 - Functional Martingale Residual Process for High-dimensional Cox Regression with Model Averaging
AU - He, Baihua
AU - Liu, Yanyan
AU - Wu, Yuanshan
AU - Yin, Guosheng
AU - Zhao, Xingqiu
N1 - Funding Information:
We thank the action editor and reviewers for their many constructive comments that strengthened the work immensely. This research was supported in part by the National Natural Science Foundation of China (Projects 11971362, 11671311, 12071483, 11771366) and the Research Grants Council of Hong Kong (Projects 17307218, 15301218, 15303319). The corresponding author is Yuanshan Wu.
Publisher Copyright:
© 2020 Baihua He, Yanyan Liu, Yuanshan Wu, Guosheng Yin and Xingqiu Zhao.
PY - 2020/10
Y1 - 2020/10
N2 - Regularization methods for the Cox proportional hazards regression with high-dimensional survival data have been studied extensively in the literature. However, if the model is misspecified, this would result in misleading statistical inference and prediction. To enhance the prediction accuracy for the relative risk and the survival probability, we propose three model averaging approaches for the high-dimensional Cox proportional hazards regression. Based on the martingale residual process, we define the delete-one cross-validation (CV) process, and further propose three novel CV functionals, including the end-time CV, integrated CV, and supremum CV, to achieve more accurate prediction for the risk quantities of clinical interest. The optimal weights for candidate models, without the constraint of summing up to one, can be obtained by minimizing these functionals, respectively. The proposed model averaging approach can attain the lowest possible prediction loss asymptotically. Furthermore, we develop a greedy model averaging algorithm to overcome the computational obstacle when the dimension is high. The performances of the proposed model averaging procedures are evaluated via extensive simulation studies, demonstrating that our methods achieve superior prediction accuracy over the existing regularization methods. As an illustration, we apply the proposed methods to the mantle cell lymphoma study.
AB - Regularization methods for the Cox proportional hazards regression with high-dimensional survival data have been studied extensively in the literature. However, if the model is misspecified, this would result in misleading statistical inference and prediction. To enhance the prediction accuracy for the relative risk and the survival probability, we propose three model averaging approaches for the high-dimensional Cox proportional hazards regression. Based on the martingale residual process, we define the delete-one cross-validation (CV) process, and further propose three novel CV functionals, including the end-time CV, integrated CV, and supremum CV, to achieve more accurate prediction for the risk quantities of clinical interest. The optimal weights for candidate models, without the constraint of summing up to one, can be obtained by minimizing these functionals, respectively. The proposed model averaging approach can attain the lowest possible prediction loss asymptotically. Furthermore, we develop a greedy model averaging algorithm to overcome the computational obstacle when the dimension is high. The performances of the proposed model averaging procedures are evaluated via extensive simulation studies, demonstrating that our methods achieve superior prediction accuracy over the existing regularization methods. As an illustration, we apply the proposed methods to the mantle cell lymphoma study.
KW - Asymptotic Optimality
KW - Censored Data
KW - Cross Validation
KW - Greedy Algorithm
KW - Martingale Residual Process
KW - Prediction
KW - Survival Analysis
UR - http://www.scopus.com/inward/record.url?scp=85094901242&partnerID=8YFLogxK
M3 - Journal article
AN - SCOPUS:85094901242
SN - 1532-4435
VL - 21
SP - 1
EP - 37
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
ER -