TY - JOUR
T1 - Analysis of combined probability and nonprobability samples
T2 - a simulation evaluation and application to a teen smoking behavior survey
AU - Xi, Wenna
AU - Hinton, Alice
AU - Lu, Bo
AU - Krotki, Karol
AU - Keller-Hamilton, Brittney
AU - Ferketich, Amy
AU - Sukasih, Amang
N1 - Publisher Copyright: © 2022 Taylor & Francis Group, LLC.
PY - 2022/7/25
Y1 - 2022/7/25
AB - In scientific studies with low-prevalence outcomes, probability sampling may be supplemented by nonprobability sampling to boost the sample size of a desired subpopulation while remaining representative of the entire study population. To utilize both probability and nonprobability samples appropriately, several methods have been proposed in the literature to generate pseudo-weights, including ad-hoc weights, inclusion probability adjusted weights, and propensity score adjusted weights. We empirically compare various weighting strategies via an extensive simulation study, where probability and nonprobability samples are combined. Weight normalization and raking adjustment are also considered. Our simulation results suggest that the unity weight method (with weight normalization) and the inclusion probability adjusted weight method yield very good overall performance. This work is motivated by the Buckeye Teen Health Study, which examines risk factors for the initiation of smoking among teenage males in Ohio. To address the low response rate in the initial probability sample and low prevalence of smokers in the target population, a small convenience sample was collected as a supplement. Our proposed method yields estimates very close to the ones from the analysis using only the probability sample and enjoys the additional benefit of being able to track more teens with risky behaviors through follow-ups.
KW - Buckeye Teen Health Study
KW - Low-prevalence outcomes
KW - Nonprobability sampling
KW - Probability sampling
KW - Propensity score
KW - Pseudo-weight
UR - http://www.scopus.com/inward/record.url?scp=85134764806&partnerID=8YFLogxK
U2 - 10.1080/03610918.2022.2102181
DO - 10.1080/03610918.2022.2102181
M3 - Article
AN - SCOPUS:85134764806
SN - 0361-0918
VL - 53
SP - 3285
EP - 3301
JO - Communications in Statistics: Simulation and Computation
JF - Communications in Statistics: Simulation and Computation
IS - 7
ER -