Interaction Identification and Estimation using Data-Adaptive Stochastic Interventions Authors: David McCoy


What’s InterXshift?

The InterXshift R package offers an approach which identifies and estimates the impact interactions in a mixed exposure on an outcome. We define interaction as the counterfactual mean of the outcome under stochastic interventions of two exposures compared to the additive counterfactual mean of the two expowures intervened on indepdentently. These interventions or exposure changes depend on naturally observed values, as described in past literature (Dı́az and van der Laan 2012; Haneuse and Rotnitzky 2013).

InterXshift builds on work described in (McCoy et al. 2023). However instead of identifying interactions through an semi-parametric definition of an F-statistics and then estimating our interaction target parameter using CV-TMLE pooled across exposure sets, we provide a more streamlined approach. In this package, we identify interactions through g-computation first - evaluating the expected outcome under joint shift compared to the sum of individual shifts using Super Learner. We then rank these estimates as the highest sets of synergy and antagonism. We then use CV-TMLE and pool within the ranks.

The package ensures robustness by employing a k-fold cross-validation framework. This framework helps in estimating a data-adaptive parameter, which is the stochastic shift target parameters for the exposure sets identified as having synergy or antagonism. The process begins by partitioning the data into parameter-generating and estimation samples. In the parameter-generating sample, we identify our ranks of antagonistic and synergistic exposure sets through a machine learning g-computation framework. In the estimation sample we then estimate our interaction target parameter using the doubly robust estimator TMLE to ensure asymptotic efficiency which allows us to construct confidence intervals for our estimates (unlike the g-comp method).

By using InterXshift, users get access to a tool that offers both k-fold specific and aggregated results for the top synergistic and antagonistic relationships, ensuring that researchers can glean the most information from their data. For a more in-depth exploration, there’s an accompanying vignette.

To utilize the package, users need to provide vectors for exposures, covariates, and outcomes. They also specify the respective δ for each exposure (indicating the degree of shift) and if this delta should be adaptive in response to positivity violations. The top_n parameter defines the top number of synergistic, antagonistic, positive and negative ranked impacts to estiamte. A detailed guide is provided in the vignette. With these inputs, InterXshift processes the data and delivers tables showcasing fold-specific results and aggregated outcomes, allowing users to glean insights effectively.

InterXshift also incorporates features from the sl3 package (Coyle, Hejazi, Malenica, et al. 2022), facilitating ensemble machine learning in the estimation process. If the user does not specify any stack parameters, InterXshift will automatically create an ensemble of machine learning algorithms that strike a balance between flexibility and computational efficiency.


Installation

Note: Because the InterXshift package (currently) depends on sl3 that allows ensemble machine learning to be used for nuisance parameter estimation and sl3 is not on CRAN the InterXshift package is not available on CRAN and must be downloaded here.

There are many depedencies for InterXshift so it’s easier to break up installation of the various packages to ensure proper installation.

First install the basis estimators used in the data-adaptive variable discovery of the exposure and covariate space:

InterXshift uses the sl3 package to build ensemble machine learners for each nuisance parameter.

remotes::install_github("tlverse/sl3@devel")

Make sure sl3 installs correctly then install InterXshift

remotes::install_github("blind-contours/InterXshift@main")

Example

To illustrate how InterXshift may be used to ascertain the effect of a mixed exposure, consider the following example:

library(InterXshift)
library(devtools)
#> Loading required package: usethis
library(kableExtra)
library(sl3)

seed <- 429153
set.seed(seed)

We will directly use synthetic data from the NIEHS used to test new mixture methods. This data has built in strong positive and negative marginal effects and certain interactions. Found here: https://github.com/niehs-prime/2015-NIEHS-MIxtures-Workshop

data("NIEHS_data_1", package = "InterXshift")
NIEHS_data_1$W <- rnorm(nrow(NIEHS_data_1), mean = 0, sd = 0.1)
w <- NIEHS_data_1[, c("W", "Z")]
a <- NIEHS_data_1[, c("X1", "X2", "X3", "X4", "X5", "X6", "X7")]
y <- NIEHS_data_1$Y

deltas <- list(
  "X1" = 1, "X2" = 1, "X3" = 1,
  "X4" = 1, "X5" = 1, "X6" = 1, "X7" = 1
)
head(NIEHS_data_1) %>%
  kbl(caption = "NIEHS Data") %>%
  kable_classic(full_width = F, html_font = "Cambria")
NIEHS Data
obs Y X1 X2 X3 X4 X5 X6 X7 Z W
1 7.534686 0.4157066 0.5308077 0.2223965 1.1592634 2.4577556 0.9438601 1.8714406 0 0.1335790
2 19.611934 0.5293572 0.9339570 1.1210595 1.3350074 0.3096883 0.5190970 0.2418065 0 0.0585291
3 12.664050 0.4849759 0.7210988 0.4629027 1.0334138 0.9492810 0.3664090 0.3502445 0 0.1342057
4 15.600288 0.8275456 1.0457137 0.9699040 0.9045099 0.9107914 0.4299847 1.0007901 0 0.0734320
5 18.606498 0.5190363 0.7802400 0.6142188 0.3729743 0.5038126 0.3575472 0.5906156 0 -0.0148427
6 18.525890 0.4009491 0.8639886 0.5501847 0.9011016 1.2907615 0.7990418 1.5097039 0 0.1749775

Based on the data key, we expect X1 to have the strongest positive effect, X5 the strongest negative. So we would expect these to take the top ranks for these marginal associations. For interactions we expect to find interactions built into the data that are antagonistic and synergistic. The github page gives details on the types of interactions built into this synthetic data. The above image shows the interaction types that we might expect to find.


ptm <- proc.time()
sim_results <- InterXshift(
  w = w,
  a = a,
  y = y,
  delta = deltas,
  n_folds = 3,
  num_cores = 6,
  outcome_type = "continuous",
  seed = seed,
  top_n = 2
)
#> 
#> Iter: 1 fn: 188.2217  Pars:  0.18996 0.81004
#> Iter: 2 fn: 188.2217  Pars:  0.18996 0.81004
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 376.1549  Pars:  0.15230 0.84770
#> Iter: 2 fn: 376.1549  Pars:  0.15230 0.84770
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 330.7270  Pars:  0.0000001547 0.9999998440
#> Iter: 2 fn: 330.7270  Pars:  0.00000009486 0.99999990514
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 395.5675  Pars:  0.9999995911 0.0000004086
#> Iter: 2 fn: 395.5675  Pars:  0.9999997531 0.0000002469
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 329.0977  Pars:  0.00000009094 0.99999990921
#> Iter: 2 fn: 329.0977  Pars:  0.00000005678 0.99999994322
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 394.9951  Pars:  0.97308 0.02692
#> Iter: 2 fn: 394.9951  Pars:  0.97315 0.02685
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 411.6276  Pars:  0.77643 0.22357
#> Iter: 2 fn: 411.6276  Pars:  0.77645 0.22355
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 396.0738  Pars:  0.9999991806 0.0000008196
#> Iter: 2 fn: 396.0738  Pars:  0.9999995607 0.0000004393
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 359.4551  Pars:  0.001522 0.998478
#> Iter: 2 fn: 358.7028  Pars:  0.57483 0.42517
#> Iter: 3 fn: 358.7028  Pars:  0.57483 0.42517
#> solnp--> Completed in 3 iterations
#> 
#> Iter: 1 fn: 271.4984  Pars:  0.05346 0.94654
#> Iter: 2 fn: 271.4984  Pars:  0.05345 0.94655
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 138.6347  Pars:  0.14848 0.85152
#> Iter: 2 fn: 138.6347  Pars:  0.14848 0.85152
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 66.5503   Pars:  0.0000001271 0.9999998736
#> Iter: 2 fn: 66.5503   Pars:  0.00000007528 0.99999992472
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: -5.0555   Pars:  0.44462 0.55538
#> Iter: 2 fn: -5.0555   Pars:  0.44462 0.55538
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 183.8985  Pars:  0.20871 0.79129
#> Iter: 2 fn: 183.8985  Pars:  0.20871 0.79129
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 383.7304  Pars:  0.15233 0.84767
#> Iter: 2 fn: 383.7304  Pars:  0.15232 0.84768
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 380.2020  Pars:  0.23886 0.76114
#> Iter: 2 fn: 380.2020  Pars:  0.23886 0.76114
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 183.6422  Pars:  0.03878 0.96122
#> Iter: 2 fn: 183.6422  Pars:  0.03878 0.96122
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 333.7236  Pars:  0.03373 0.96627
#> Iter: 2 fn: 333.7236  Pars:  0.03373 0.96627
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 341.5709  Pars:  0.21640 0.78360
#> Iter: 2 fn: 341.5709  Pars:  0.21640 0.78360
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 388.3596  Pars:  0.34865 0.65135
#> Iter: 2 fn: 388.3596  Pars:  0.34865 0.65135
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 342.6278  Pars:  0.60993 0.39007
#> Iter: 2 fn: 342.6278  Pars:  0.60876 0.39124
#> Iter: 3 fn: 342.6278  Pars:  0.60876 0.39124
#> solnp--> Completed in 3 iterations
#> 
#> Iter: 1 fn: 388.2628  Pars:  0.20855 0.79145
#> Iter: 2 fn: 388.2628  Pars:  0.20855 0.79145
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 393.7808  Pars:  0.58218 0.41782
#> Iter: 2 fn: 393.7808  Pars:  0.58218 0.41782
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 390.2628  Pars:  0.19613 0.80387
#> Iter: 2 fn: 390.2628  Pars:  0.19612 0.80388
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 360.8686  Pars:  0.999997556 0.000002444
#> Iter: 2 fn: 360.8686  Pars:  0.999998373 0.000001627
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 299.4745  Pars:  0.20485 0.79515
#> Iter: 2 fn: 299.4745  Pars:  0.20486 0.79514
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 185.8446  Pars:  0.03814 0.96186
#> Iter: 2 fn: 185.8446  Pars:  0.03814 0.96186
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 86.1185   Pars:  0.26299 0.73701
#> Iter: 2 fn: 86.1185   Pars:  0.26298 0.73702
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 18.4205   Pars:  0.37690 0.62310
#> Iter: 2 fn: 18.4205   Pars:  0.37690 0.62310
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 184.0532  Pars:  0.00000009628 0.99999990364
#> Iter: 2 fn: 184.0532  Pars:  0.00000005058 0.99999994942
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 356.5325  Pars:  0.18829 0.81171
#> Iter: 2 fn: 356.5325  Pars:  0.18829 0.81171
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 353.1351  Pars:  0.24568 0.75432
#> Iter: 2 fn: 353.1351  Pars:  0.24568 0.75432
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 180.6660  Pars:  0.00000004153 0.99999995880
#> Iter: 2 fn: 180.6660  Pars:  0.00000002203 0.99999997797
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 390.6584  Pars:  0.04254 0.95746
#> Iter: 2 fn: 390.6584  Pars:  0.04253 0.95747
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 345.8211  Pars:  0.03422 0.96578
#> Iter: 2 fn: 345.8211  Pars:  0.03422 0.96578
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 400.1339  Pars:  0.999997412 0.000002589
#> Iter: 2 fn: 400.1339  Pars:  0.9999993596 0.0000006404
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 347.2235  Pars:  0.000007085 0.999992915
#> Iter: 2 fn: 347.2235  Pars:  0.000001319 0.999998681
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 401.4868  Pars:  0.999997153 0.000002846
#> Iter: 2 fn: 401.4868  Pars:  0.999998259 0.000001741
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 403.9855  Pars:  0.75152 0.24848
#> Iter: 2 fn: 403.9855  Pars:  0.75154 0.24846
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 401.3947  Pars:  0.999984 0.000016
#> Iter: 2 fn: 401.3947  Pars:  0.999995323 0.000004677
#> Iter: 3 fn: 401.3947  Pars:  0.999998239 0.000001761
#> solnp--> Completed in 3 iterations
#> 
#> Iter: 1 fn: 350.1500  Pars:  0.999997608 0.000002392
#> Iter: 2 fn: 350.1500  Pars:  0.99999857 0.00000143
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 256.1493  Pars:  0.09862 0.90138
#> Iter: 2 fn: 256.1493  Pars:  0.09862 0.90138
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 176.8860  Pars:  0.0000000001619 1.0000000003752
#> Iter: 2 fn: 176.8860  Pars:  4.409e-11 1.000e+00
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 83.6283   Pars:  0.14539 0.85461
#> Iter: 2 fn: 83.6283   Pars:  0.14539 0.85461
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 6.8251    Pars:  0.008576 0.991424
#> Iter: 2 fn: 6.8251    Pars:  0.00852 0.99148
#> Iter: 3 fn: 6.8251    Pars:  0.00852 0.99148
#> solnp--> Completed in 3 iterations
#> 
#> Iter: 1 fn: 192.2133  Pars:  0.05664 0.94336
#> Iter: 2 fn: 192.2133  Pars:  0.05664 0.94336
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 396.6718  Pars:  0.18042 0.81958
#> Iter: 2 fn: 396.6718  Pars:  0.18042 0.81958
#> solnp--> Completed in 2 iterations
#> 
#> Iter: 1 fn: 387.5808  Pars:  0.08945 0.91055
#> Iter: 2 fn: 387.5808  Pars:  0.08945 0.91055
#> solnp--> Completed in 2 iterations
proc.time() - ptm
#>     user   system  elapsed 
#>   60.758    3.005 1236.890

## marginal effects
top_positive_effects <- sim_results$`Pos Shift Results by Rank`
top_negative_effects <- sim_results$`Neg Shift Results by Rank`

## interaction effects
pooled_synergy_effects <- sim_results$`Pooled Synergy Results by Rank`
pooled_antagonism_effects <- sim_results$`Pooled Antagonism Results by Rank`

k_fold_synergy_effects <- sim_results$`K Fold Synergy Results`
k_fold_antagonism_effects <- sim_results$`K Fold Antagonism Results`
top_positive_effects$`Rank 1` %>%
  kbl(caption = "Rank 1 Positive Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Rank 1 Positive Stochastic Intervention Results
Condition Psi Variance SE Lower CI Upper CI P-value Fold Type Variables N Delta
X1 13.81500 0.4971182 0.7050661 12.4331 15.1969 0 1 Indiv Shift X1 167 1
X1 12.00894 0.7999055 0.8943744 10.2560 13.7619 0 2 Indiv Shift X1 167 1
X1 16.34802 0.5371240 0.7328874 14.9116 17.7844 0 3 Indiv Shift X1 166 1
Rank 1 14.98981 0.2786728 0.5278947 13.9552 16.0245 0 Pooled TMLE Indiv Shift Rank 1 500 1

Above we show the findings for the top rank positive marginal effect. Here we consistently find X1 which is true based on what is built into the DGP. The pooled estimate is pooling the findings for the top ranked positive result found across the folds which is all X1 in this case.

Next we look at the top negative result:

top_negative_effects$`Rank 2` %>%
  kbl(caption = "Rank 1 Negative Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Rank 1 Negative Stochastic Intervention Results
Condition Psi Variance SE Lower CI Upper CI P-value Fold Type Variables N Delta
X5 -3.735474 0.6972921 0.8350402 -5.3721 -2.0988 7.7e-06 1 Indiv Shift X5 167 1
X5 -3.708447 0.6808466 0.8251343 -5.3257 -2.0912 7.0e-06 2 Indiv Shift X5 167 1
X5 -3.337799 0.4530761 0.6731093 -4.6571 -2.0185 7.0e-07 3 Indiv Shift X5 166 1
Rank 2 -3.502256 0.2273206 0.4767815 -4.4367 -2.5678 0.0e+00 Pooled TMLE Indiv Shift Rank 2 500 1

Here we consistently see X5 as having the strongest negative impact which is also true compared to the true DGP.

Next we will look at the top synergy results which is defined as the exposures that when shifted jointly have the highest, most positive, expected outcome difference compared to the sum of individual shifts of the same variables.

pooled_synergy_effects$`Rank 1` %>%
  kbl(caption = "Rank 1 Synergy Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Rank 1 Synergy Stochastic Intervention Results
Rank Psi Variance SE Lower CI Upper CI P-value Fold N Delta Exposure 1 Delta Exposure 2 Type
Rank 1 -0.7658578 0.2417480 0.4916788 -1.7295 0.1978 0.2747394 Pooled TMLE 500 1 1 Var 1
Rank 1 -3.5759161 0.2393057 0.4891888 -4.5347 -2.6171 0.0000003 Pooled TMLE 500 1 1 Var 2
Rank 1 -3.9894164 0.2265181 0.4759392 -4.9222 -3.0566 0.0000000 Pooled TMLE 500 1 1 Joint
Rank 1 0.3523575 0.2508620 0.5008613 -0.6293 1.3340 0.6185685 Pooled TMLE 500 1 1 Interaction

Above this table shows the pooled results for the rank 1 synergy exposure interaction. Of course, the exposure sets in the interaction deemed to have the highest impact, synergy, may differ between the folds and thus this pooling may be over different exposure sets. Thus, the first line shows the pooled estimate for a shift in the first variable, the second line the second variable, third line the joint and fourth line the difference between the joint and sum of the first two lines, or the interaction effect. Therefore, in this case, we could be pooling over different variables because of inconcistency in what is included as rank 1 between the folds. Next we look at the k-fold specific results.

k_fold_synergy_effects$`Rank 1` %>%
  kbl(caption = "K-fold Synergy Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
K-fold Synergy Stochastic Intervention Results
Rank Psi Variance SE Lower CI Upper CI P-value Fold N Delta Exposure 1 Delta Exposure 2 Type
1 -0.8485023 0.7309842 0.8549761 -2.5242 0.8272 0.3588033 1 167 1 1 X4
1 -3.8157757 0.7073221 0.8410244 -5.4642 -2.1674 0.0000317 1 167 1 1 X5
1 -3.8404800 0.6566876 0.8103627 -5.4288 -2.2522 0.0000199 1 167 1 1 X4-X5
1 0.8237980 0.9055973 0.9516288 -1.0414 2.6890 0.3984039 1 167 1 1 Interaction
1 -0.6330329 0.6895704 0.8304037 -2.2606 0.9945 0.4872591 2 167 1 1 X4
1 -3.7729863 0.7084949 0.8417214 -5.4227 -2.1232 0.0000391 2 167 1 1 X5
1 -4.2714349 0.6780636 0.8234462 -5.8854 -2.6575 0.0000025 2 167 1 1 X4-X5
1 0.1345843 0.7007521 0.8371094 -1.5061 1.7753 0.8830556 2 167 1 1 Interaction
1 -0.4549679 0.5524247 0.7432528 -1.9117 1.0018 0.5976862 3 166 1 1 X4
1 -3.3168676 0.4551012 0.6746119 -4.6391 -1.9947 0.0000538 3 166 1 1 X5
1 -3.9151131 0.4671708 0.6834989 -5.2547 -2.5755 0.0000022 3 166 1 1 X4-X5
1 -0.1432777 0.5245870 0.7242838 -1.5628 1.2763 0.8663046 3 166 1 1 Interaction

Here we see that the interaction between X4 and X5 was consistently found to have the highest synergistic interaction across the folds. Therefore, for our pooled parameter var 1 represents the pooled effects of shifting X4, var 2 represents the pooled effects of shifting X5, joint is X4 and X5 together and the interaction represents the interaction effect for these two variables.

Next we’ll look at the k-fold antagonistic interactions:

k_fold_antagonism_effects$`Rank 1` %>%
  kbl(caption = "K-fold Antagonistic Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
K-fold Antagonistic Stochastic Intervention Results
Rank Psi Variance SE Lower CI Upper CI P-value Fold N Delta Exposure 1 Delta Exposure 2 Type
1 13.542234 0.4842421 0.6958751 12.1783 14.9061 0.0000000 1 167 1 1 X1
1 3.487427 0.7311449 0.8550701 1.8115 5.1633 0.0001623 1 167 1 1 X7
1 10.366036 0.3799426 0.6163949 9.1579 11.5741 0.0000000 1 167 1 1 X1-X7
1 -6.663625 0.6288383 0.7929932 -8.2179 -5.1094 0.0000000 1 167 1 1 Interaction
1 -1.245423 1.5739266 1.2545623 -3.7043 1.2135 0.2661755 2 167 1 1 X1
1 2.963629 0.8560357 0.9252220 1.1502 4.7770 0.0020626 2 167 1 1 X7
1 7.220341 0.5888704 0.7673789 5.7163 8.7244 0.0000000 2 167 1 1 X1-X7
1 5.502135 2.0114489 1.4182556 2.7224 8.2819 0.0000038 2 167 1 1 Interaction
1 16.279113 0.4910244 0.7007314 14.9057 17.6525 0.0000000 3 166 1 1 X1
1 2.878816 0.4632692 0.6806388 1.5448 4.2128 0.0004840 3 166 1 1 X7
1 7.522615 0.3101366 0.5568991 6.4311 8.6141 0.0000000 3 166 1 1 X1-X7
1 -11.635314 0.4261335 0.6527890 -12.9148 -10.3559 0.0000000 3 166 1 1 Interaction

Here, we see that in all the folds the X1-X7 has the strongest antagonistic relationship. X1-X7 was found in all the folds and therefore the oracle parameter is interpreted the same as we found in the synergy results. Which is here:

pooled_antagonism_effects$`Rank 1` %>%
  kbl(caption = "Rank 1 Antagonism Stochastic Intervention Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
Rank 1 Antagonism Stochastic Intervention Results
Rank Psi Variance SE Lower CI Upper CI P-value Fold N Delta Exposure 1 Delta Exposure 2 Type
Rank 1 13.215850 0.2422382 0.4921770 12.2512 14.1805 0.0e+00 Pooled TMLE 500 1 1 Var 1
Rank 1 3.253814 0.2683451 0.5180204 2.2385 4.2691 6.2e-06 Pooled TMLE 500 1 1 Var 2
Rank 1 9.138029 0.1578888 0.3973522 8.3592 9.9168 0.0e+00 Pooled TMLE 500 1 1 Joint
Rank 1 -7.331635 0.3119851 0.5585563 -8.4264 -6.2369 0.0e+00 Pooled TMLE 500 1 1 Interaction

So we see the interaction effect is negative, -7.33, and represents the pooled interaction effects across the folds which are all X1-X7.

Overall, this package provides implementation of estimation a non-parametric definition of interaction. We define positive values as synergy meaning the expected outcome under joint shift is much larger compared to individual addivitive effects. Likewise, we define antagonism as negative effects, the joint value being lower than the additive effects.

In this NIEHS data set w correctly identify the strongest individual effects in positive and negative directions and identify exposure relationships consistently for our definition of synergy and antagonism.


Issues

If you encounter any bugs or have any specific feature requests, please file an issue. Further details on filing issues are provided in our contribution guidelines.


Contributions

Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.


Citation

After using the InterXshift R package, please cite the following:


  • R/tmle3shift - An R package providing an independent implementation of the same core routines for the TML estimation procedure and statistical methodology as is made available here, through reliance on a unified interface for Targeted Learning provided by the tmle3 engine of the tlverse ecosystem.

  • R/medshift - An R package providing facilities to estimate the causal effect of stochastic treatment regimes in the mediation setting, including classical (IPW) and augmented double robust (one-step) estimators. This is an implementation of the methodology explored by Dı́az and Hejazi (2020).

  • R/haldensify - A minimal package for estimating the conditional density treatment mechanism component of this parameter based on using the highly adaptive lasso (Coyle, Hejazi, Phillips, et al. 2022; Hejazi, Coyle, and van der Laan 2020) in combination with a pooled hazard regression. This package implements a variant of the approach advocated by Dı́az and van der Laan (2011).


Funding

The development of this software was supported in part through NIH grant P42ES004705 from NIEHS


License

© 2020-2022 David B. McCoy

The contents of this repository are distributed under the MIT license. See below for details:

MIT License
Copyright (c) 2020-2022 David B. McCoy
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

References

Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, Rachael V Phillips, and Oleg Sofrygin. 2022. “sl3: Modern Machine Learning Pipelines for Super Learning.” https://doi.org/10.5281/zenodo.1342293.
Coyle, Jeremy R, Nima S Hejazi, Rachael V Phillips, Lars W van der Laan, and Mark J van der Laan. 2022. “hal9001: The Scalable Highly Adaptive Lasso.” https://doi.org/10.5281/zenodo.3558313.
Dı́az, Iván, and Nima S Hejazi. 2020. “Causal Mediation Analysis for Stochastic Interventions.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82 (3): 661–83. https://doi.org/10.1111/rssb.12362.
Dı́az, Iván, and Mark J van der Laan. 2011. “Super Learner Based Conditional Density Estimation with Application to Marginal Structural Models.” The International Journal of Biostatistics 7 (1): 1–20.
———. 2012. “Population Intervention Causal Effects Based on Stochastic Interventions.” Biometrics 68 (2): 541–49.
Haneuse, Sebastian, and Andrea Rotnitzky. 2013. “Estimation of the Effect of Interventions That Modify the Received Treatment.” Statistics in Medicine 32 (30): 5260–77.
Hejazi, Nima S, Jeremy R Coyle, and Mark J van der Laan. 2020. “hal9001: Scalable Highly Adaptive Lasso Regression in R.” Journal of Open Source Software 5 (53): 2526. https://doi.org/10.21105/joss.02526.
McCoy, David B., Alan E. Hubbard, Alejandro Schuler, and Mark J. van der Laan. 2023. “Semi-Parametric Identification and Estimation of Interaction and Effect Modification in Mixed Exposures Using Stochastic Interventions.” https://arxiv.org/abs/2305.01849.