Simulated data with exposure misclassification and selection bias
Source: R/data_em_sel.R
df_em_sel.Rd
Data containing two sources of bias (exposure misclassification and
selection bias), three known confounders, and 100,000 observations. The
data are obtained by sampling rows with replacement from
df_em_sel_source, with sampling probability equal to S, and then
removing the columns X and S. The resulting data correspond to what a
researcher would see in the real world: a misclassified exposure,
Xstar, and missing data for those not selected into the study (S=0).
As seen in df_em_sel_source, the true, unbiased exposure-outcome odds
ratio is 2.
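
The sampling step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual generating script: toy_source is a small hypothetical stand-in for df_em_sel_source, its columns are drawn independently at arbitrary probabilities, and S is assumed to be a 0/1 selection indicator used directly as the sampling weight.

```r
set.seed(1)
n <- 1000

# Hypothetical stand-in for df_em_sel_source. The real source data has its
# own columns (including three confounders) and its own generating model.
toy_source <- data.frame(
  X     = rbinom(n, 1, 0.5),  # true exposure (unobserved in practice)
  Xstar = rbinom(n, 1, 0.5),  # misclassified exposure
  Y     = rbinom(n, 1, 0.3),  # outcome
  S     = rbinom(n, 1, 0.6)   # selection indicator (assumed 0/1)
)

# Sample row indices with replacement, weighting each row by S, so rows
# with S = 0 can never be drawn.
idx <- sample(seq_len(n), size = n, replace = TRUE, prob = toy_source$S)
toy_observed <- toy_source[idx, ]

# Drop the columns a researcher would not observe.
toy_observed <- toy_observed[, setdiff(names(toy_observed), c("X", "S"))]

head(toy_observed)
```

Because the weights are 0 or 1 in this sketch, the result contains only rows that were selected into the study, and only the misclassified exposure Xstar remains alongside the outcome.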