Online Inference in High-Dimensional Generalized Linear Models with Streaming Data

Lan Luo, Ruijian Han, Yuanyuan Lin, Jian Huang

Research output: Journal article publication > Journal article > Academic research > peer-review

Abstract

In this paper, we develop an online statistical inference approach for high-dimensional generalized linear models with streaming data for real-time estimation and inference. We propose an online debiased lasso method that aligns with the data collection scheme of streaming data. Online debiased lasso differs from offline debiased lasso in two important aspects. First, it updates component-wise confidence intervals of regression coefficients with only summary statistics of the historical data. Second, online debiased lasso adds an additional term to correct approximation errors accumulated throughout the online updating procedure. We show that our proposed online debiased estimators in generalized linear models are asymptotically normal. This result provides a theoretical basis for carrying out real-time interim statistical inference with streaming data. Extensive numerical experiments are conducted to evaluate the performance of our proposed online debiased lasso method. These experiments demonstrate the effectiveness of our algorithm and support the theoretical results. Furthermore, we illustrate the application of our method with a high-dimensional text dataset.
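The idea of updating an estimate from summary statistics alone can be illustrated in the simplest special case, the linear model with Gaussian errors. The sketch below is not the authors' online debiased lasso: it omits the debiasing step, the confidence intervals, and the error-correction term, and it does not cover general GLMs. It only shows that a lasso fit can be refreshed batch by batch while storing nothing but the running sums of X^T X and X^T y. The class name `StreamingLasso`, the method `update`, the penalty `lam`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


class StreamingLasso:
    """Hypothetical helper: lasso for a linear model from running summaries.

    Only X^T X, X^T y and the sample count are stored, so raw historical
    batches can be discarded after each update.
    """

    def __init__(self, p, lam):
        self.gram = np.zeros((p, p))  # running sum of X^T X
        self.xty = np.zeros(p)        # running sum of X^T y
        self.n = 0                    # number of observations seen so far
        self.lam = lam                # lasso penalty (assumed fixed here)
        self.beta = np.zeros(p)       # current estimate, reused as warm start

    def update(self, X, y, n_sweeps=200):
        # Fold the new batch into the summary statistics.
        self.gram += X.T @ X
        self.xty += X.T @ y
        self.n += X.shape[0]

        # Coordinate descent on (1/(2n)) * ||y - X b||^2 + lam * ||b||_1,
        # written entirely in terms of the summaries G and c.
        G = self.gram / self.n
        c = self.xty / self.n
        b = self.beta.copy()
        for _ in range(n_sweeps):
            for j in range(b.size):
                if G[j, j] == 0.0:
                    continue
                # Partial residual correlation, excluding coordinate j.
                r = c[j] - G[j] @ b + G[j, j] * b[j]
                b[j] = soft_threshold(r, self.lam) / G[j, j]
        self.beta = b
        return b


# Simulated stream (assumed data): 5 batches of 100 rows, 20 covariates.
rng = np.random.default_rng(0)
p = 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X_all = rng.standard_normal((500, p))
y_all = X_all @ beta_true + 0.1 * rng.standard_normal(500)

model = StreamingLasso(p, lam=0.05)
for start in range(0, 500, 100):
    model.update(X_all[start:start + 100], y_all[start:start + 100])

# Reference fit that sees all the data at once.
batch = StreamingLasso(p, lam=0.05)
batch.update(X_all, y_all)
```

In this special case the summaries are sufficient, so the streamed fit and the all-at-once fit minimize the same objective. For non-Gaussian GLMs the loss is not quadratic and no such exact summary exists, which is where approximation errors arise and why the paper introduces a correction term.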

Original language: English
Pages (from-to): 3443-3471
Number of pages: 29
Journal: Electronic Journal of Statistics
Volume: 17
Issue number: 2
Publication status: Published - Jan 2023

Keywords

  • Confidence interval
  • generalized linear models
  • high-dimensional data
  • online debiased lasso

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
