Abstract
We consider causal inference in observational studies with choice-based sampling, in which subject enrollment is stratified on treatment choice. Choice-based sampling has been considered mainly in the econometrics literature, but it can be useful for biomedical studies as well, especially when one of the treatments being compared is uncommon. We propose new methods for estimating the population average treatment effect under choice-based sampling, including doubly robust methods motivated by semiparametric theory. A doubly robust, locally efficient estimator may be obtained by replacing nuisance functions in the efficient influence function with estimates based on parametric models. The use of machine learning methods to estimate nuisance functions leads to estimators that are consistent and asymptotically efficient under broader conditions. The methods are compared in simulation experiments and illustrated in the context of a large observational study in obstetrics. We also make suggestions on how to choose the target proportion of treated subjects and the sample size in designing a choice-based observational study.
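As a rough illustration of the doubly robust idea, the sketch below applies a sampling-weighted augmented inverse probability weighting (AIPW) estimator to simulated choice-based data. It is not the estimator proposed in the article, whose exact form is not given in this abstract. It is a generic construction under stated assumptions: the population proportion of treated subjects `q` is taken as known, the nuisance functions are fitted with simple parametric models (a weighted logistic regression for the propensity score and per-arm linear regressions for the outcome), and all function names and the data-generating model are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_logistic(X, a, w, n_iter=25):
    """Weighted logistic regression fitted by Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (w * (a - p))
        hess = X.T @ (X * (w * p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def weighted_aipw(x, a, y, q):
    """Sampling-weighted AIPW estimate of the population ATE.

    Assumption: q = P(A = 1) in the target population is known.
    """
    n = len(a)
    p_s = a.mean()  # proportion of treated subjects in the sample
    # Weights that restore the population treated/untreated mix.
    w = np.where(a == 1, q / p_s, (1 - q) / (1 - p_s))
    X = np.column_stack([np.ones(n), x])
    # Population propensity score, estimated with the sampling weights.
    e = expit(X @ weighted_logistic(X, a, w))
    # Outcome regressions, fitted separately within each treatment arm.
    m1 = X @ ols(X[a == 1], y[a == 1])
    m0 = X @ ols(X[a == 0], y[a == 0])
    # AIPW score: regression contrast plus inverse-probability-weighted residuals.
    psi = m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)
    return np.sum(w * psi) / np.sum(w)

# Hypothetical population with an uncommon treatment and a true ATE of 2.
rng = np.random.default_rng(0)
N = 200_000
x_pop = rng.normal(size=N)
a_pop = rng.binomial(1, expit(-3.0 + x_pop))
y_pop = 1.0 + 2.0 * a_pop + x_pop + rng.normal(size=N)
q = a_pop.mean()

# Choice-based sample: enroll equal numbers of treated and untreated subjects.
treated = rng.choice(np.flatnonzero(a_pop == 1), size=1000, replace=False)
control = rng.choice(np.flatnonzero(a_pop == 0), size=1000, replace=False)
idx = np.concatenate([treated, control])

ate_hat = weighted_aipw(x_pop[idx], a_pop[idx], y_pop[idx], q)
print(f"population treated fraction: {q:.3f}")
print(f"weighted AIPW estimate of the ATE: {ate_hat:.3f}")
```

In this simulation both nuisance models are correctly specified, so the estimate should fall near the true effect of 2. Double robustness refers to the property that an estimator of this kind remains consistent when only one of the two models is correct.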
Original language | English |
---|---|
Article number | 20180093 |
Journal | International Journal of Biostatistics |
Volume | 15 |
Issue number | 1 |
Publication status | Published - May 2019 |
Keywords
- causal inference
- double robustness
- efficient influence function
- machine learning
- semiparametric theory
- super learner
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty