This guide explains the learner options exposed through
formatArguments(). There are two separate nuisance-learning
tasks:
- treatment assignment models, supplied through the SuperLearner package
- censoring and event hazard models, supplied as a candidate hazard library
The hazard library currently works as a cross-validated discrete
selector: for each censoring or event type, concrete
evaluates the candidate hazard learners and uses the learner with the
lowest validation loss.
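The selection rule can be illustrated with a small base-R sketch. The fold losses below are made-up numbers, and averaging losses across folds is an assumption made for illustration; it is not taken from the package internals.

```r
# Hypothetical validation losses: one row per fold, one column per candidate.
fold_loss <- rbind(
  fold1 = c(Cox = 0.42, Coxnet = 0.40, RSF = 0.43),
  fold2 = c(Cox = 0.41, Coxnet = 0.39, RSF = 0.40),
  fold3 = c(Cox = 0.43, Coxnet = 0.41, RSF = 0.42)
)

# Cross-validated risk per candidate (assumed here to be the fold average).
cv_risk <- colMeans(fold_loss)

# Discrete selection: the candidate with the lowest risk gets weight 1.
selected <- names(which.min(cv_risk))
weights <- stats::setNames(as.numeric(names(cv_risk) == selected), names(cv_risk))
```

Unlike an ensemble Super Learner, a discrete selector never blends candidates, so exactly one entry of weights is 1 and the rest are 0.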
The code snippets below assume a data.table named trial, built as in the
Trialist quickstart.
Conservative first library
For first use in a trial, start with simple and stable learners.
Model <- list(
  arm = c("SL.mean", "SL.glm"),
  "0" = list(Censor = survival::Surv(time, event == 0) ~ arm + age + sex),
  "1" = list(Event = survival::Surv(time, event == 1) ~ arm + age + sex)
)
For competing risks, add one hazard model list for each positive event code.
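With several event codes, the per-event entries can be built in a loop rather than typed by hand. The following is a minimal sketch: the helper make_hazard_formula(), the covariate string, and the entry names Censor and Cox are illustrative choices, not part of the package.

```r
# Positive event codes present in the data (assumed here to be 1 and 2).
event_codes <- c(1L, 2L)
covariates <- "arm + age + sex"

# Hypothetical helper: build a Cox formula for one event code.
make_hazard_formula <- function(code) {
  stats::as.formula(
    sprintf("survival::Surv(time, event == %d) ~ %s", code, covariates)
  )
}

# Treatment library plus the censoring model under code "0".
Model <- list(arm = c("SL.mean", "SL.glm"))
Model[["0"]] <- list(Censor = make_hazard_formula(0L))

# One hazard model list per positive event code.
for (code in event_codes) {
  Model[[as.character(code)]] <- list(Cox = make_hazard_formula(code))
}
```

The formulas are only constructed here, not evaluated, so the sketch runs without any trial data loaded.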
Treatment Super Learner
For randomized trials, the treatment model is often simple. If treatment was randomized 1:1 and the analysis is intent-to-treat, a simple library such as c("SL.mean", "SL.glm") is a good starting point.
For observational or covariate-adaptive settings, add flexible
learners already available through SuperLearner.
Only include treatment learners whose packages are installed and appropriate for the sample size.
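One way to follow this advice is to filter optional learners by whether their backing package is available. This is a sketch only: the choice of SL.glmnet, SL.ranger, and SL.xgboost as optional wrappers is an assumption for illustration, and whether they suit a given sample size still needs judgment.

```r
# Learners that need nothing beyond base R and stats.
base_library <- c("SL.mean", "SL.glm")

# Optional SuperLearner wrappers, named by wrapper, valued by backing package.
optional_learners <- c(
  SL.glmnet = "glmnet",
  SL.ranger = "ranger",
  SL.xgboost = "xgboost"
)

# Keep only the wrappers whose backing package can be loaded.
is_installed <- vapply(
  optional_learners, requireNamespace, logical(1), quietly = TRUE
)
treatment_library <- c(base_library, names(optional_learners)[is_installed])
```

The resulting character vector can then be used as the arm entry of Model.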
Hazard learner aliases
Hazard learners are specified inside Model[["0"]],
Model[["1"]], and other event-specific entries.
| Alias | Learner | Package |
|---|---|---|
| Cox formula | Cox proportional hazards | survival |
| "coxnet" | Penalized Cox model | glmnet |
| "rsf" or "randomForestSRC" | Random survival forest | randomForestSRC |
| "aareg" or "additive_hazards" | Additive hazards | survival |
| "hal" or "hal9001" | HAL pooled discrete-time hazard | hal9001 |
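Before fitting, it can help to list which packages a given Model relies on. The sketch below transcribes the alias table into a lookup vector; required_hazard_packages() is a hypothetical helper written for this guide, not a function exported by the package.

```r
# Alias-to-package lookup, transcribed from the table above.
alias_packages <- c(
  coxnet = "glmnet",
  rsf = "randomForestSRC",
  randomForestSRC = "randomForestSRC",
  aareg = "survival",
  additive_hazards = "survival",
  hal = "hal9001",
  hal9001 = "hal9001"
)

# Hypothetical helper: packages needed by the hazard entries of a Model.
required_hazard_packages <- function(model, treatment = "arm") {
  hazard_entries <- model[setdiff(names(model), treatment)]
  # Character entries are aliases; formula entries are Cox models.
  aliases <- as.character(unlist(
    lapply(hazard_entries, function(lib) Filter(is.character, lib)),
    use.names = FALSE
  ))
  unknown <- setdiff(aliases, names(alias_packages))
  if (length(unknown) > 0) {
    stop("Unknown hazard learner alias: ", paste(unknown, collapse = ", "))
  }
  # survival is always listed because Cox formulas depend on it.
  unique(c("survival", unname(alias_packages[aliases])))
}

Model <- list(
  arm = c("SL.mean", "SL.glm"),
  "0" = list(Cox = survival::Surv(time, event == 0) ~ arm + age, Coxnet = "coxnet"),
  "1" = list(RSF = "rsf", Aalen = "aareg", HAL = "hal")
)
pkgs <- required_hazard_packages(Model)
```

Stopping on an unknown alias catches typos such as "cox_net" before any model fitting starts.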
Optional packages:
install.packages(c("glmnet", "randomForestSRC", "hal9001"))
To test optional hazard learners on a small built-in example before trying your own data:
Sys.setenv(CONCRETE_RUN_OPTIONAL_LEARNERS = "true")
source(system.file("examples", "trialist-smoke-test.R", package = "concrete"))
When optional learners are installed, the smoke test prints a summary like this:
| analysis | status | elapsed_sec | converged | step | max_ratio | failing_components |
|---|---|---|---|---|---|---|
| cox_only | ok | 1.2 | TRUE | 4 | 0.743 | 0 |
| additive_hazards | ok | 1.2 | TRUE | 4 | 0.899 | 0 |
| coxnet | ok | 2.3 | TRUE | 13 | 0.838 | 0 |
| rsf | ok | 1.1 | TRUE | 4 | 0.748 | 0 |
| hal | ok | 1.5 | TRUE | 4 | 0.747 | 0 |
This table is not a benchmark. It is a quick installation and learner-path check. On real trial data, compare estimates, convergence diagnostics, and runtime across the learner ladder.
Cox plus machine-learning hazards
This example gives each event type a small candidate library. The selected hazard learner can differ across censoring, the event of interest, and competing events.
Model <- list(
  arm = c("SL.mean", "SL.glm", "SL.glmnet"),
  "0" = list(
    Cox = survival::Surv(time, event == 0) ~ arm + age + sex + albumin,
    Coxnet = "coxnet",
    Aalen = "aareg"
  ),
  "1" = list(
    Cox = survival::Surv(time, event == 1) ~ arm + age + sex + albumin,
    RSF = "rsf",
    HAL = "hal"
  ),
  "2" = list(
    Cox = survival::Surv(time, event == 2) ~ arm + age + sex + albumin,
    RSF = "rsf"
  )
)
ConcreteArgs <- formatArguments(
  DataTable = trial,
  EventTime = "time",
  EventType = "event",
  Treatment = "arm",
  ID = "id",
  Intervention = makeITT(),
  TargetTime = c(365, 730),
  TargetEvent = 1,
  CVArg = list(V = 5),
  Model = Model,
  UpdateMethod = "adaptive",
  EICStopRule = "absolute",
  EICStopAbsTol = 0.02 / sqrt(nrow(trial)),
  Verbose = FALSE
)
ConcreteEst <- doConcrete(ConcreteArgs)
Suggested libraries by trial size
These are starting points, not rules.
| Setting | Suggested treatment library | Suggested hazard library |
|---|---|---|
| Small trial or rare event | SL.mean, SL.glm | Cox formulas, additive hazards |
| Moderate trial | SL.mean, SL.glm, SL.glmnet | Cox, Coxnet, additive hazards |
| Larger trial with nonlinear risk | add tree/boosting learners | Cox, Coxnet, RSF, HAL |
| First convergence debugging | SL.mean, SL.glm | Cox only |
Inspect selected hazard learners
When ReturnModels = TRUE, the object returned by doConcrete() keeps the
initial nuisance fits together with their learner information.
fits <- attr(ConcreteEst, "InitFits")
# Treatment model Super Learner weights.
fits[["arm"]]
# Hazard model selection risks are stored on each fitted hazard object.
lapply(fits[setdiff(names(fits), "arm")], function(fit) {
  attr(fit, "HazSL")
})
The hazard learner output records cross-validated risks and the selected candidate for each event type. A simplified example looks like:
$`1`
$`1`$SupLrnCVRisks
   Cox Coxnet    RSF    HAL
118.20 116.75 117.40 119.05
$`1`$SLCoef
   Cox Coxnet    RSF    HAL
     0      1      0      0
Here, Coxnet was selected for event type 1.
The exact object names depend on the names you used in
Model.
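To summarize the selection across event types, the coefficient vectors can be reduced to one learner name each. The sketch below works on a mock list that mirrors the simplified example above; the real object may carry additional fields, so treat the structure as an assumption.

```r
# Mock of the list returned by the lapply() call above, using the values
# from the simplified example (not real output).
haz_sl <- list(
  "1" = list(
    SupLrnCVRisks = c(Cox = 118.20, Coxnet = 116.75, RSF = 117.40, HAL = 119.05),
    SLCoef = c(Cox = 0, Coxnet = 1, RSF = 0, HAL = 0)
  )
)

# Name of the candidate holding the largest coefficient, per event type.
selected_learners <- vapply(
  haz_sl,
  function(x) names(x$SLCoef)[which.max(x$SLCoef)],
  character(1)
)
```

The result is a named character vector keyed by event code, which is convenient to paste into a comparison table.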
Practical testing sequence
For a trial testing week, use the same estimand and data set across a ladder of learner libraries.
model_cox <- list(
  arm = c("SL.mean", "SL.glm"),
  "0" = list(Cox = survival::Surv(time, event == 0) ~ .),
  "1" = list(Cox = survival::Surv(time, event == 1) ~ .)
)
model_penalized <- list(
  arm = c("SL.mean", "SL.glm", "SL.glmnet"),
  "0" = list(Cox = survival::Surv(time, event == 0) ~ ., Coxnet = "coxnet"),
  "1" = list(Cox = survival::Surv(time, event == 1) ~ ., Coxnet = "coxnet")
)
model_flexible <- list(
  arm = c("SL.mean", "SL.glm", "SL.glmnet"),
  "0" = list(Cox = survival::Surv(time, event == 0) ~ ., Coxnet = "coxnet"),
  "1" = list(
    Cox = survival::Surv(time, event == 1) ~ .,
    Coxnet = "coxnet",
    RSF = "rsf",
    Aalen = "aareg",
    HAL = "hal"
  )
)
Compare:
- point estimates and confidence intervals
- selected hazard learners
- convergence diagnostics
- runtime
- whether the result is stable to removing the most flexible learners
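A small harness makes the comparison repeatable by recording status and runtime for each rung of the ladder. This is a sketch under stated assumptions: run_ladder() is a hypothetical helper, and the stand-in fit function below only simulates success and failure so the example runs without trial data.

```r
# Hypothetical harness: apply fit_fun to each model and record the outcome.
run_ladder <- function(models, fit_fun) {
  rows <- lapply(names(models), function(label) {
    start <- proc.time()[["elapsed"]]
    # Capture errors so one failing library does not stop the whole ladder.
    fit <- tryCatch(fit_fun(models[[label]]), error = function(e) e)
    data.frame(
      analysis = label,
      status = if (inherits(fit, "error")) "error" else "ok",
      elapsed_sec = round(proc.time()[["elapsed"]] - start, 1),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Stand-in fit function for demonstration: fails on the second model.
demo <- run_ladder(
  list(ok_model = 1, bad_model = 2),
  function(m) if (m == 2) stop("fit failed") else m
)
```

For a real run, fit_fun would wrap the formatArguments() and doConcrete() calls shown earlier, with every argument except Model held fixed across the ladder.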
