find_max_effect_mods_rct.Rd
This function estimates effect modification in an RCT (with known randomization
probability \(\alpha\)) under one of two target parameters, selected by rct_type:
If rct_type = "ate", we estimate a subject-level ATE contribution,
\(Q(1,W_i) - Q(0,W_i)\), and then apply a
TMLE-style update (with known \(\alpha\)) to obtain the influence function for the ATE.
The function then performs a data-adaptive partition to find the subpopulations with
the largest ATE difference.
If rct_type = "incps", we use a two-stage incremental-propensity-shift
approach, moving the treatment probability from \(\alpha\) to \(\alpha + \delta\). We compute subject-level
differences \(Q_{\alpha+\delta}(i) - Q_{\alpha}(i)\)
and partition on this "shift effect."
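The two subject-level quantities described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: a plain lm() stands in for the sl3 outcome learners, the data are simulated, and the variable names (q1, q0, h, eps, ic) are made up for illustration. The shift effect at the end assumes that the mean outcome under treatment probability p is p * Q(1,W) + (1 - p) * Q(0,W), which may differ from the two-stage procedure the function actually uses.

```r
set.seed(1)
n     <- 500
alpha <- 0.5   # known randomization probability (assumed value)
delta <- 0.1   # hypothetical shift in treatment probability

# Simulated RCT data; the true ATE is 0.5
W <- rnorm(n)
A <- rbinom(n, 1, alpha)
Y <- 1 + A * (0.5 + W) + W + rnorm(n)
dat <- data.frame(W = W, A = A, Y = Y)

# Outcome regression Q(A, W); lm() is a stand-in for sl3 learners
q_fit <- lm(Y ~ A * W, data = dat)
q1 <- predict(q_fit, newdata = transform(dat, A = 1))
q0 <- predict(q_fit, newdata = transform(dat, A = 0))
qa <- predict(q_fit, newdata = dat)

# Clever covariate built from the known randomization probability
h <- A / alpha - (1 - A) / (1 - alpha)

# TMLE-style fluctuation: regress Y on h with the initial fit as offset
eps <- coef(lm(Y ~ -1 + h + offset(qa)))[["h"]]
q1_star <- q1 + eps / alpha
q0_star <- q0 - eps / (1 - alpha)
qa_star <- qa + eps * h

# Subject-level ATE contributions and their influence-function values
q_diff <- q1_star - q0_star
psi    <- mean(q_diff)
ic     <- h * (Y - qa_star) + q_diff - psi
se     <- sd(ic) / sqrt(n)

# Shift effect under the stated assumption (not necessarily the package's)
shift_diff <- delta * q_diff

c(ate = psi, se = se)
```

The vectors q_diff and shift_diff play the role of the quantities the partition search would split on; ic is the kind of per-subject influence-function value used for standard errors.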
Note: For more valid subpopulation inference, perform sample splitting or cross-validation externally. P-values from a single pass, where the same data are used both to discover and to evaluate the subpopulations, can be too optimistic.
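As a rough illustration of the external cross-validation mentioned in the note, the following base-R sketch builds training/validation pairs that could be supplied as at, av, and fold. The data set, the number of folds, and the splits object are all hypothetical; only the fold construction is shown, not the call itself.

```r
set.seed(42)
n <- 200

# Simulated trial data (hypothetical column names)
dat <- data.frame(W1 = rnorm(n), A = rbinom(n, 1, 0.5))
dat$Y <- dat$A * (dat$W1 > 0) + rnorm(n)

# Assign each row to one of n_folds folds of (nearly) equal size
n_folds <- 5
fold_id <- sample(rep(seq_len(n_folds), length.out = n))

# For fold k: discover regions on the other folds, evaluate on fold k
splits <- lapply(seq_len(n_folds), function(k) {
  list(
    fold = k,
    at   = dat[fold_id != k, , drop = FALSE],  # training rows
    av   = dat[fold_id == k, , drop = FALSE]   # validation rows
  )
})

sapply(splits, function(s) c(train = nrow(s$at), valid = nrow(s$av)))
```

Each element of splits keeps discovery and evaluation rows disjoint, which is the property that guards against the optimistic single-pass p-values.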
find_max_effect_mods_rct(
at,
av,
delta,
a_name,
w_names,
outcome,
outcome_type,
mu_learner,
alpha = NULL,
top_n = 3,
seed,
min_obs,
fold,
max_depth = 2,
pval_thresh = 0.05,
rct_type = c("ate", "incps")
)
at: A training-fold data.frame with columns w_names, a_name, and outcome.
av: A validation-fold data.frame (or the same set as at, for a single pass).
delta: A numeric scalar \(\delta\) for the incremental coverage shift, from \(\alpha\) to \(\alpha + \delta\).
a_name: Name of the binary exposure (e.g. "A").
w_names: Character vector of baseline covariate names.
outcome: Name of the outcome variable.
outcome_type: One of "continuous", "binary", or "count" (used for sl3 tasks).
mu_learner: A list of sl3 learners for the outcome regression.
alpha: Known randomization probability; if NULL, it is estimated from at.
top_n: Number of top rules to return from the partition search.
seed: Random seed for reproducibility.
min_obs: Minimum number of observations in a valid split branch.
fold: Label for the fold index (for cross-validation).
max_depth: Maximum depth of the partition search tree.
pval_thresh: P-value threshold for accepting a split.
rct_type: Either "ate" or "incps".
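A hedged example of assembling these arguments is sketched below. The simulated data, the argument values, and the choice of learners are illustrative assumptions, not recommendations from the package. The call is guarded so that the block only invokes the function when both sl3 and find_max_effect_mods_rct are available in the session.

```r
set.seed(7)
n <- 400

# Simulated RCT with effect modification by W1 (hypothetical data)
dat <- data.frame(
  W1 = rnorm(n),
  W2 = rbinom(n, 1, 0.5),
  A  = rbinom(n, 1, 0.5)
)
dat$Y <- 1 + dat$A * (dat$W1 > 0) + rnorm(n)

# One training/validation split
idx <- sample(n, n / 2)

# Arguments that do not depend on sl3 (values chosen for illustration)
args <- list(
  at = dat[idx, ], av = dat[-idx, ],
  delta = 0.1, a_name = "A", w_names = c("W1", "W2"),
  outcome = "Y", outcome_type = "continuous",
  alpha = 0.5, top_n = 3, seed = 7, min_obs = 25, fold = 1,
  max_depth = 2, pval_thresh = 0.05, rct_type = "ate"
)

# Only run the call when the package and sl3 are actually available
if (requireNamespace("sl3", quietly = TRUE) &&
    exists("find_max_effect_mods_rct")) {
  args$mu_learner <- list(sl3::Lrnr_mean$new(), sl3::Lrnr_glm$new())
  res <- do.call(find_max_effect_mods_rct, args)
  str(res, max.level = 1)
}
```

Passing rct_type as a single string avoids relying on how the function resolves the default c("ate", "incps").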
A list with:
Data frame with 2 rows per discovered region: one for \(V\), one for \(V^c\).
Either the subject-level ATE difference (\(Q(1)-Q(0)\)) or the incremental-shift difference, evaluated on the validation set.
Corresponding influence function (or difference of the shift influence functions) for each subject in the validation set.
Vector of av_q_estimates in the discovered region \(V\) (for the first discovered partition).
Vector of av_q_estimates outside that region (complement).
Vector of av_hn_estimates in \(V\).
Vector of av_hn_estimates in \(V^c\).
The rows in \(V\).
The rows in \(V^c\).
The full validation set (with appended columns) for post-hoc usage.
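For post-hoc use, one way to summarize a discovered region against its complement is sketched below. The four input vectors are simulated stand-ins for the region-level estimate and influence-function vectors described above, and summarise_region is a hypothetical helper. The sketch assumes the influence-function values are centered and that the two regions can be treated as independent samples.

```r
set.seed(3)

# Simulated stand-ins for the per-subject vectors in V and V^c
q_in   <- rnorm(120, mean = 1.0, sd = 0.3)  # effect estimates in V
q_out  <- rnorm(180, mean = 0.2, sd = 0.3)  # effect estimates in V^c
hn_in  <- rnorm(120, mean = 0, sd = 2)      # influence-function values in V
hn_out <- rnorm(180, mean = 0, sd = 2)      # influence-function values in V^c

# Mean effect with an influence-function-based standard error and 95% CI
summarise_region <- function(q, hn) {
  est <- mean(q)
  se  <- sd(hn) / sqrt(length(hn))
  c(estimate = est, se = se,
    lower = est - 1.96 * se, upper = est + 1.96 * se)
}

tab <- rbind(
  V  = summarise_region(q_in, hn_in),
  Vc = summarise_region(q_out, hn_out)
)

# Wald test for the difference between the region and its complement
diff_est <- tab["V", "estimate"] - tab["Vc", "estimate"]
diff_se  <- sqrt(tab["V", "se"]^2 + tab["Vc", "se"]^2)
p_value  <- 2 * pnorm(-abs(diff_est / diff_se))

tab
```

When the region was discovered and evaluated on the same data, a p-value computed this way inherits the optimism described in the note above.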