Optimizing Count Responses in Surveys: A Machine-learning Approach

Qiang Fu, Xin Guo, Kenneth C. Land

Research output: Journal article publication › Journal article › Academic research › peer-review

8 Citations (Scopus)

Abstract

Count responses with grouping and right censoring have long been used in surveys to study a variety of behaviors, status, and attitudes. Yet grouping or right-censoring decisions of count responses still rely on arbitrary choices made by researchers. We develop a new method for evaluating grouping and right-censoring decisions of count responses from a (semisupervised) machine-learning perspective. This article uses Poisson multinomial mixture models to conceptualize the data-generating process of count responses with grouping and right censoring and demonstrates the link between grouping-scheme choices and asymptotic distributions of the Poisson mixture. To search for the optimal grouping scheme maximizing objective functions of the Fisher information (matrix), an innovative three-step M algorithm is then proposed to process infinitely many grouping schemes based on Bayesian A-, D-, and E-optimalities. A new R package is developed to implement this algorithm and evaluate grouping schemes of count responses. Results show that an optimal grouping scheme not only leads to a more efficient sampling design but also outperforms a nonoptimal one even if the latter has more groups.
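
The link between a grouping scheme and the Fisher information can be illustrated with a deliberately simplified sketch. It assumes a single Poisson rate rather than the Poisson multinomial mixture the article uses, it scores schemes by the plain per-respondent Fisher information rather than Bayesian A-, D-, or E-optimality, and it is not the article's R package or search algorithm. The function names and the `cuts` convention (lower bounds of each group, with the last group right-censored) are hypothetical. A grouped count is a multinomial draw with cell probabilities p_j(λ), so the information about λ is the sum over groups of (dp_j/dλ)² / p_j.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(X = k); returns 0 for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def grouped_fisher_information(cuts, lam):
    """Fisher information about lam from one grouped, right-censored count.

    cuts: increasing lower bounds of the groups, starting at 0
          (hypothetical convention). Group j covers cuts[j] .. cuts[j+1]-1;
          the last group is right-censored: cuts[-1] and above.
    """
    info = 0.0
    for j, lo in enumerate(cuts):
        if j + 1 < len(cuts):
            hi = cuts[j + 1] - 1
            p = sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))
            # d/dlam of P(k) is P(k-1) - P(k), so the sum telescopes.
            dp = poisson_pmf(lo - 1, lam) - poisson_pmf(hi, lam)
        else:
            # Right-censored top group: everything at or above lo.
            p = 1.0 - sum(poisson_pmf(k, lam) for k in range(lo))
            dp = poisson_pmf(lo - 1, lam)
        if p > 0:
            info += dp * dp / p
    return info


lam = 2.0
three_groups = [0, 2, 4]       # groups: 0-1, 2-3, 4+
four_groups = [0, 8, 9, 10]    # groups: 0-7, 8, 9, 10+
print(grouped_fisher_information(three_groups, lam))
print(grouped_fisher_information(four_groups, lam))
print(1.0 / lam)               # information from the ungrouped count
```

Under these assumptions, with λ = 2 the three-group scheme retains far more information than the four-group scheme, because the latter puts nearly all respondents in a single cell. This is a toy instance of the abstract's point that a well-placed scheme can outperform one with more groups.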

Original language: English
Pages (from-to): 637-671
Number of pages: 35
Journal: Sociological Methods and Research
Volume: 49
Issue number: 3
Publication status: Published - Aug 2020

Keywords

  • experimental design
  • Fisher information
  • machine learning
  • optimality
  • Poisson distribution
  • right censoring
  • search algorithm
  • survey methodology
  • zero inflation

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Sociology and Political Science
