Adjust for exposure misclassification and outcome misclassification.

adjust_em_om returns the exposure-outcome odds ratio and confidence interval, adjusted for exposure misclassification and outcome misclassification.

Usage

adjust_em_om(
  data_observed,
  data_validation = NULL,
  x_model_coefs = NULL,
  y_model_coefs = NULL,
  x1y0_model_coefs = NULL,
  x0y1_model_coefs = NULL,
  x1y1_model_coefs = NULL,
  level = 0.95
)

Arguments

data_observed: Object of class data_observed corresponding to the data to perform bias analysis on.
data_validation: Object of class data_validation corresponding to the validation data used to adjust for bias in the observed data. Here, the validation data should have data for the same variables as in the observed data, plus data for the true and misclassified exposure and outcome corresponding to the observed exposure and outcome in data_observed.
x_model_coefs: The regression coefficients corresponding to the model: logit(P(X=1)) = δ₀ + δ₁X* + δ₂Y* + δ_{2+j}C_j, where X represents the binary true exposure, X* is the binary misclassified exposure, Y* is the binary misclassified outcome, C represents the vector of measured confounders (if any), and j corresponds to the number of measured confounders. The number of parameters is therefore 3 + j.
y_model_coefs: The regression coefficients corresponding to the model: logit(P(Y=1)) = β₀ + β₁X + β₂Y* + β_{2+j}C_j, where Y represents the binary true outcome, X is the binary true exposure, Y* is the binary misclassified outcome, C represents the vector of measured confounders (if any), and j corresponds to the number of measured confounders. The number of parameters is therefore 3 + j.
x1y0_model_coefs: The regression coefficients corresponding to the model: log(P(X=1,Y=0) / P(X=0,Y=0)) = γ_{1,0} + γ_{1,1}X* + γ_{1,2}Y* + γ_{1,2+j}C_j, where X is the binary true exposure, Y is the binary true outcome, X* is the binary misclassified exposure, Y* is the binary misclassified outcome, C represents the vector of measured confounders (if any), and j corresponds to the number of measured confounders.
x0y1_model_coefs: The regression coefficients corresponding to the model: log(P(X=0,Y=1) / P(X=0,Y=0)) = γ_{2,0} + γ_{2,1}X* + γ_{2,2}Y* + γ_{2,2+j}C_j, where X is the binary true exposure, Y is the binary true outcome, X* is the binary misclassified exposure, Y* is the binary misclassified outcome, C represents the vector of measured confounders (if any), and j corresponds to the number of measured confounders.
x1y1_model_coefs: The regression coefficients corresponding to the model: log(P(X=1,Y=1) / P(X=0,Y=0)) = γ_{3,0} + γ_{3,1}X* + γ_{3,2}Y* + γ_{3,2+j}C_j, where X is the binary true exposure, Y is the binary true outcome, X* is the binary misclassified exposure, Y* is the binary misclassified outcome, C represents the vector of measured confounders (if any), and j corresponds to the number of measured confounders.
level: Number between 0 and 1 giving the confidence level of the interval. Default is 0.95.

Value

A list where the first item is the odds ratio estimate of the effect of the exposure on the outcome and the second item is the confidence interval as the vector: (lower bound, upper bound).
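
Because the return value is a list, its items can be extracted by name. The sketch below uses a stand-in list rather than a call to adjust_em_om, so it runs without the package; the item names estimate and ci and the numeric values are copied from the printed output of the validation-data example in the Examples section.

```r
# Stand-in for the list returned by adjust_em_om(); the values are copied
# from the validation-data example output on this page.
result <- list(estimate = 2.084552, ci = c(2.004242, 2.168081))

# Pull out the point estimate and the two interval bounds.
or_estimate <- result$estimate
ci_lower <- result$ci[1]
ci_upper <- result$ci[2]

# Format a one-line summary for reporting.
summary_line <- sprintf(
  "OR = %.2f (95%% CI: %.2f, %.2f)",
  or_estimate, ci_lower, ci_upper
)
summary_line
#> [1] "OR = 2.08 (95% CI: 2.00, 2.17)"
```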

Details

Bias adjustment can be performed by inputting either a validation dataset or the necessary bias parameters. Two different options for the bias parameters are available here: 1) parameters from separate models of X and Y (x_model_coefs and y_model_coefs) or 2) parameters from a joint model of X and Y (x1y0_model_coefs, x0y1_model_coefs, and x1y1_model_coefs).
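
As a rough sketch of where such coefficients might come from, the separate-model parameters could be estimated with logistic regressions on validation data, and the joint-model parameters with a multinomial regression. Everything below is an assumption made for illustration: the data are simulated, the column names simply mirror the Examples section, and the use of glm() and nnet::multinom() is one possible approach rather than the package's documented workflow.

```r
# Simulated validation data (illustrative only; not a package dataset).
set.seed(1234)
n <- 5000
C1 <- rbinom(n, 1, 0.5)
X <- rbinom(n, 1, plogis(-1 + 0.5 * C1))
Y <- rbinom(n, 1, plogis(-2 + log(2) * X + 0.5 * C1))
Xstar <- rbinom(n, 1, plogis(-2 + 3 * X + 0.3 * Y + 0.2 * C1))
Ystar <- rbinom(n, 1, plogis(-2.5 + 3 * Y + 0.3 * X + 0.2 * C1))
validation <- data.frame(X, Y, Xstar, Ystar, C1)

# Option 1: separate models of X and Y.
# logit(P(X=1)) = d0 + d1*Xstar + d2*Ystar + d3*C1
x_fit <- glm(X ~ Xstar + Ystar + C1, family = binomial, data = validation)
# logit(P(Y=1)) = b0 + b1*X + b2*Ystar + b3*C1
y_fit <- glm(Y ~ X + Ystar + C1, family = binomial, data = validation)
x_model_coefs <- unname(coef(x_fit))
y_model_coefs <- unname(coef(y_fit))

# Option 2: joint model of X and Y, with (X=0, Y=0) as the reference level.
validation$XY <- factor(
  paste0("x", validation$X, "y", validation$Y),
  levels = c("x0y0", "x1y0", "x0y1", "x1y1")
)
xy_fit <- nnet::multinom(XY ~ Xstar + Ystar + C1,
                         data = validation, trace = FALSE)
xy_coefs <- coef(xy_fit)  # one row per non-reference level
x1y0_model_coefs <- unname(xy_coefs["x1y0", ])
x0y1_model_coefs <- unname(xy_coefs["x0y1", ])
x1y1_model_coefs <- unname(xy_coefs["x1y1", ])
```

With one confounder, each vector has four entries (intercept, X* or X, Y*, C1), matching the length of the coefficient vectors in the Examples section.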

Values for the regression coefficients can be supplied as fixed values or as single draws from a probability distribution (e.g., rnorm(1, mean = 2, sd = 1)). The latter has the advantage of allowing the researcher to capture the uncertainty in the bias parameter estimates. To incorporate this uncertainty in the estimate and confidence interval, this function should be run in a loop across bootstrap samples of the data frame being analyzed. The estimate and confidence interval are then obtained from the median and quantiles of the resulting distribution of odds ratio estimates.
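
The bootstrap procedure described above might be sketched as follows. The helper bootstrap_adjust and its arguments are hypothetical and not part of the package; it accepts any function that takes a resampled data frame and returns a list with an estimate item. To keep the sketch runnable without the package, it is demonstrated with a toy stand-in that returns a crude odds ratio, and a possible adjust_em_om wrapper is shown only in comments, with made-up sd values.

```r
# Hypothetical helper: resample rows with replacement, apply `adjust_fun`
# to each bootstrap sample, and summarise the resulting odds ratios.
bootstrap_adjust <- function(df, adjust_fun, n_boot = 1000, level = 0.95) {
  estimates <- vapply(seq_len(n_boot), function(i) {
    boot_df <- df[sample(nrow(df), replace = TRUE), , drop = FALSE]
    adjust_fun(boot_df)$estimate
  }, numeric(1))
  alpha <- 1 - level
  list(
    estimate = median(estimates),
    ci = unname(quantile(estimates, c(alpha / 2, 1 - alpha / 2)))
  )
}

# With the package loaded, `adjust_fun` might look like this. Each call
# draws new coefficients; the means come from the Examples section and the
# sd of 0.05 is an arbitrary illustrative choice:
# adjust_fun <- function(boot_df) {
#   adjust_em_om(
#     data_observed = data_observed(boot_df, exposure = "Xstar",
#                                   outcome = "Ystar", confounders = "C1"),
#     x_model_coefs = rnorm(4, mean = c(-2.15, 1.64, 0.35, 0.38), sd = 0.05),
#     y_model_coefs = rnorm(4, mean = c(-3.10, 0.63, 1.60, 0.39), sd = 0.05)
#   )
# }

# Toy stand-in so the sketch runs on its own: a crude (unadjusted) odds
# ratio from a logistic regression on simulated data.
set.seed(42)
toy <- data.frame(Xstar = rbinom(500, 1, 0.4))
toy$Ystar <- rbinom(500, 1, plogis(-1 + 0.7 * toy$Xstar))
toy_adjust <- function(boot_df) {
  fit <- glm(Ystar ~ Xstar, family = binomial, data = boot_df)
  list(estimate = unname(exp(coef(fit)["Xstar"])))
}

res <- bootstrap_adjust(toy, toy_adjust, n_boot = 200)
res$estimate  # median of the bootstrap odds ratios
res$ci        # 2.5% and 97.5% quantiles
```

Passing the adjustment step in as a function keeps the resampling logic separate from the choice of bias parameters, so the same loop can be reused with either parameterization.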

Examples

df_observed <- data_observed(
  data = df_em_om,
  exposure = "Xstar",
  outcome = "Ystar",
  confounders = "C1"
)

# Using validation data -----------------------------------------------------
df_validation <- data_validation(
  data = df_em_om_source,
  true_exposure = "X",
  true_outcome = "Y",
  confounders = "C1",
  misclassified_exposure = "Xstar",
  misclassified_outcome = "Ystar"
)

adjust_em_om(
  data_observed = df_observed,
  data_validation = df_validation
)
#> $estimate
#> [1] 2.084552
#> 
#> $ci
#> [1] 2.004242 2.168081
#> 

# Using x_model_coefs and y_model_coefs -------------------------------------
adjust_em_om(
  data_observed = df_observed,
  x_model_coefs = c(-2.15, 1.64, 0.35, 0.38),
  y_model_coefs = c(-3.10, 0.63, 1.60, 0.39)
)
#> $estimate
#> [1] 2.076923
#> 
#> $ci
#> [1] 1.996799 2.160261
#> 

# Using x1y0_model_coefs, x0y1_model_coefs, and x1y1_model_coefs ------------
adjust_em_om(
  data_observed = df_observed,
  x1y0_model_coefs = c(-2.18, 1.63, 0.23, 0.36),
  x0y1_model_coefs = c(-3.17, 0.22, 1.60, 0.40),
  x1y1_model_coefs = c(-4.76, 1.82, 1.83, 0.72)
)
#> $estimate
#> [1] 2.049521
#> 
#> $ci
#> [1] 1.970392 2.131827
#>