TY - JOUR
T1 - Causal Inference with Complex Surveys
T2 - A Unified Perspective on Sample Selection and Exposure Selection
AU - Nattino, Giovanni
AU - Ashmead, Robert
AU - Lu, Bo
N1 - Publisher Copyright: © 2024 The Author(s). Published with license by Taylor & Francis Group, LLC.
PY - 2025
Y1 - 2025
N2 - Probability surveys are a major source of population-representative data for policy research and program evaluation. However, the data come with the added complications of being observational and selected with unequal probabilities. Propensity score adjustments have become increasingly popular for inferring causal relationships in non-randomized studies, but when using survey data, estimates of the population-level causal effect may be biased if the sampling design is not adequately adjusted for. The current practice of using propensity score estimators with complex surveys is somewhat ad hoc. We propose a potential-outcome super-population framework to streamline the causal analysis. We also develop propensity-score-and-survey-weighted estimators and corresponding variance estimators, and establish their asymptotic properties. Our framework clarifies the confusion regarding the use of survey-weighted propensity scores in practice: the choice depends on the available sampling weights. Various estimators are compared in a simulation study, which shows that the proposed estimators perform better than the competing methods in terms of bias and confidence interval coverage when treatment effects are heterogeneous. To address an important public health issue, we evaluate the impact of e-cigarette use on future tobacco use intention in teens, using a large nationally representative survey in the United States.
KW - Clustered survey
KW - Exposure selection
KW - Propensity score weighting
KW - Sample selection
KW - Tobacco initiation
UR - http://www.scopus.com/inward/record.url?scp=105002969320&partnerID=8YFLogxK
U2 - 10.1080/00031305.2024.2423814
DO - 10.1080/00031305.2024.2423814
M3 - Article
AN - SCOPUS:105002969320
SN - 0003-1305
VL - 79
SP - 173
EP - 183
JO - The American Statistician
JF - The American Statistician
IS - 2
ER -