Creates augmented training and validation datasets by duplicating each observation and applying a specified shift (delta) to the exposure variable in the duplicated copy. This is useful for implementing stochastic shift interventions in causal inference studies.

create_augmented_data(at, av, delta, var, covars)

Arguments

at

Data frame containing the training data.

av

Data frame containing the validation data.

delta

Numeric value specifying the shift to be applied to the exposure variable.

var

Character string specifying the name of the exposure variable to be shifted.

covars

Character vector specifying the names of the covariate columns in the data frames.

Value

A list containing two elements, at_dup and av_dup, which are the augmented training and validation data frames, respectively. Each data frame has an additional column, intervention, indicating whether the observation is under the original or the shifted exposure.

Examples

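# Example usage: simulate training and validation data with a
# continuous exposure A and two binary covariates W1 and W2.
set.seed(123)  # make the simulated data reproducible
training_data <- data.frame(A = rnorm(100), W1 = rbinom(100, 1, 0.5), W2 = rbinom(100, 1, 0.5))
validation_data <- data.frame(A = rnorm(50), W1 = rbinom(50, 1, 0.5), W2 = rbinom(50, 1, 0.5))

# Shift to apply to the exposure, and the columns involved.
delta <- -0.5
exposure_variable <- "A"
covariates <- c("W1", "W2")

# Build the augmented data sets and extract each element of the returned list.
result <- create_augmented_data(training_data, validation_data, delta, exposure_variable, covariates)
augmented_training_data <- result$at_dup
augmented_validation_data <- result$av_dup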
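
The duplication-and-shift step described above can be illustrated with a minimal, self-contained sketch. This is not the package's implementation. The helper name sketch_augment is hypothetical, and the sketch assumes that the shift is additive (exposure + delta), that only the exposure and covariate columns are kept, and that the intervention column is coded 0 for original rows and 1 for shifted rows. The actual function may differ on any of these points.

```r
# Hypothetical stand-in for illustration only; NOT the package's implementation.
sketch_augment <- function(data, delta, var, covars) {
  # Assumption: keep only the exposure and covariate columns.
  original <- data[, c(var, covars), drop = FALSE]
  shifted <- original

  # Assumption: the shift is additive, applied to the duplicated rows only.
  shifted[[var]] <- shifted[[var]] + delta

  # Assumed encoding: 0 = original exposure, 1 = shifted exposure.
  original$intervention <- 0L
  shifted$intervention <- 1L

  # Stack the original rows on top of their shifted copies.
  rbind(original, shifted)
}

set.seed(1)
toy <- data.frame(A = rnorm(5), W1 = rbinom(5, 1, 0.5), W2 = rbinom(5, 1, 0.5))
toy_dup <- sketch_augment(toy, delta = -0.5, var = "A", covars = c("W1", "W2"))

nrow(toy_dup)                # twice the number of input rows
table(toy_dup$intervention)  # equal counts of original and shifted rows
```

Under these assumptions, applying the same helper separately to the training and validation data would yield two data frames analogous to at_dup and av_dup.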