Estimate the Exposure Mechanism via Generalized Propensity Score for One Exposure Variable

indiv_stoch_shift_est_g_exp(
  exposure,
  delta,
  g_learner,
  covars,
  av,
  at,
  adaptive_delta,
  hn_trunc_thresh,
  use_multinomial,
  lower_bound,
  upper_bound,
  outcome_type,
  density_type,
  n_bins,
  max_degree
)

Arguments

exposure

A character representing the label of the exposure variable. This variable should be a column name in the input data.

delta

A numeric value specifying the shift in the observed value of the exposure for evaluating counterfactual observations. Positive values will result in upward shifts, while negative values will result in downward shifts.

g_learner

An object containing a set of instantiated learners from the sl3 package, to be used in fitting an ensemble model for GPS estimation. Learners should be chosen based on the structure of the data and the relationship between the exposure and the covariates.

covars

A character vector representing the labels of the covariate variables. These variables should be column names in the input data and serve as control variables in the GPS estimation.

av

A data frame containing the validation data for the current fold. This data is used to evaluate the fitted GPS model during cross-validation.

at

A data frame containing the training data for the current fold. This data is used to fit the GPS model during cross-validation.

adaptive_delta

A logical indicating whether to adaptively adjust delta based on positivity (estimated from the clever covariate) meeting the hn_trunc_thresh level. If TRUE, the function will adjust delta to ensure sufficient overlap in the GPS distributions.

hn_trunc_thresh

A numeric value specifying the level of the clever covariate in the adaptive delta procedure. It represents the minimum proportion of observations in each exposure group to be included when adjusting delta.
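The adaptive procedure can be pictured with a small sketch. It assumes, purely for illustration, that the clever covariate is the density ratio g(A - delta | W) / g(A | W) and that the exposure is conditionally normal. The helper adapt_delta and its shrink argument are hypothetical and are not part of the package, whose internal procedure may differ.

```r
# Hypothetical sketch (not the package's implementation): shrink delta until
# the largest density ratio g(A - delta | W) / g(A | W) is at or below a
# threshold. For illustration, A | W is assumed to be Normal(mean = w, sd = 1).
adapt_delta <- function(a, w, delta, hn_trunc_thresh, shrink = 0.9) {
  ratio <- function(d) {
    dnorm(a - d, mean = w, sd = 1) / dnorm(a, mean = w, sd = 1)
  }
  while (max(ratio(delta)) > hn_trunc_thresh && abs(delta) > 1e-8) {
    delta <- delta * shrink  # reduce the shift by 10% and re-check
  }
  delta
}

set.seed(1)
w <- rnorm(200)
a <- w + rnorm(200)
adapted <- adapt_delta(a, w, delta = 2, hn_trunc_thresh = 10)
adapted
```

A smaller hn_trunc_thresh forces a smaller adapted delta, since the ratio approaches 1 as delta approaches 0.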

use_multinomial

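A logical indicating whether to estimate the exposure mechanism as a probability mass function (PMF) using a multinomial model (TRUE) or as a probability density function (PDF) of the exposure (FALSE).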

lower_bound

A numeric value specifying the lower bound of the exposure variable to prevent shifting past this limit during the GPS estimation.

upper_bound

A numeric value specifying the upper bound of the exposure variable to prevent shifting past this limit during the GPS estimation.
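As a rough illustration of how the two bounds might constrain a shift, the sketch below clamps shifted exposure values to the interval [lower_bound, upper_bound]. The helper shift_within_bounds is hypothetical and not part of the package; how the function itself treats values near the bounds is an assumption here.

```r
# Hypothetical helper (not part of the package): shift an exposure by delta
# and clamp the result so it never leaves [lower_bound, upper_bound].
shift_within_bounds <- function(a, delta, lower_bound, upper_bound) {
  shifted <- a + delta
  pmin(pmax(shifted, lower_bound), upper_bound)
}

a_obs <- c(0.5, 2.0, 4.8)
a_shifted <- shift_within_bounds(a_obs, delta = 0.5,
                                 lower_bound = 0, upper_bound = 5)
a_shifted  # the last value would be 5.3 unclamped, so it is held at 5
```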

outcome_type

A character specifying whether the outcome of the GPS model (that is, the exposure) is treated as 'categorical' or 'continuous', depending on whether the exposure variable has been discretized. This information is used to determine the appropriate learner type for the GPS model.

density_type

A character specifying the type of density estimator to use for GPS estimation. Possible options are 'SL' (Super Learner) or 'HAL' (Highly Adaptive Lasso).

n_bins

A numeric value specifying the number of bins to be used if the exposure variable is discretized. This parameter is only applicable when exposure_quantized is TRUE.
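To make the role of n_bins concrete, the sketch below discretizes a simulated continuous exposure into equal-frequency bins with base R. Whether the package uses quantile-based cut points or another binning rule is an assumption; the variable names are illustrative only.

```r
# Illustrative only: discretize a continuous exposure into n_bins
# equal-frequency bins using quantile cut points.
set.seed(1)
exposure_values <- rexp(100)
n_bins <- 5

breaks <- quantile(exposure_values, probs = seq(0, 1, length.out = n_bins + 1))
exposure_binned <- cut(exposure_values, breaks = breaks,
                       include.lowest = TRUE, labels = FALSE)

table(exposure_binned)  # number of observations falling in each bin
```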

max_degree

A numeric value specifying the maximum degree of interactions to be used in the Highly Adaptive Lasso (HAL) if HAL is chosen as the density estimator. Higher values will result in a more flexible GPS model, but may increase the risk of overfitting.

Value

A data.table with four columns, containing estimates of the generalized propensity score at a downshift (g(A - delta | W)), no shift (g(A | W)), an upshift (g(A + delta | W)), and an upshift of magnitude two (g(A + 2 * delta | W)).

Details

This function computes the generalized propensity score (GPS) for the observed data with one exposure variable, considering different shift levels. It estimates the GPS at the observed exposure A and at the counterfactual shifted exposure levels A - delta, A + delta, and A + 2 * delta.
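The idea of evaluating one fitted density at several shifted exposure values can be sketched in base R. This is a minimal illustration under a linear-Gaussian working model, not the package's estimator, which fits an sl3 ensemble or HAL. The simulated data, the column names (downshift, noshift, upshift, upupshift), and the use of a data.frame instead of a data.table are all assumptions made for the example.

```r
# Minimal sketch, not the package's estimator: fit a linear-Gaussian model for
# the exposure on training data (at), then evaluate the conditional density on
# validation data (av) at A - delta, A, A + delta, and A + 2 * delta.
set.seed(42)
n <- 500
dat <- data.frame(w1 = rnorm(n), w2 = rbinom(n, 1, 0.5))
dat$a <- 1 + 0.5 * dat$w1 - 0.3 * dat$w2 + rnorm(n)

in_train <- rep(c(TRUE, FALSE), length.out = n)  # alternate rows for one fold
at <- dat[in_train, ]
av <- dat[!in_train, ]

fit <- lm(a ~ w1 + w2, data = at)       # conditional mean of A given W
mu_av <- predict(fit, newdata = av)     # predicted mean on validation rows
sigma_hat <- summary(fit)$sigma         # residual standard deviation

delta <- 0.5
gps <- function(a_eval) dnorm(a_eval, mean = mu_av, sd = sigma_hat)

gn <- data.frame(
  downshift = gps(av$a - delta),
  noshift   = gps(av$a),
  upshift   = gps(av$a + delta),
  upupshift = gps(av$a + 2 * delta)
)
head(gn)
```

Each row of gn corresponds to one validation observation, so the four columns share the same fitted model and differ only in the exposure value at which the density is evaluated.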