This guide provides comprehensive parameter documentation for all E2E functions.
E2E includes example datasets for both diagnostic and prognostic modeling.
models_dia(): Trains base classification models for diagnostic tasks. Parameters:
data (required): Data frame with sample names (column 1), outcomes coded 0/1 (column 2), and features (columns 3+)
model (required): Character vector of model names, or “all_dia” for all models
tune: Logical (default FALSE). Whether to perform hyperparameter tuning
threshold_choices: Threshold selection method
seed: Integer (default 123). Random seed for reproducibility
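The data layout described above can be sketched in base R. This is only an illustration: the column names, the `check_dia_layout()` helper, and the rule that features must be numeric are assumptions of this sketch, not part of E2E. The final call is left commented out and uses only the parameters documented here, with model names borrowed from the examples elsewhere in this guide.

```r
# Toy data in the documented layout: sample names, 0/1 outcome, features.
# All column names are illustrative assumptions.
set.seed(123)
n <- 20
toy_dia <- data.frame(
  sample  = paste0("S", seq_len(n)),
  outcome = rep(c(0, 1), each = n / 2),
  feat_a  = rnorm(n),
  feat_b  = rnorm(n),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of E2E): check the three layout rules.
check_dia_layout <- function(df) {
  stopifnot(is.data.frame(df), ncol(df) >= 3)
  ids_ok      <- !anyDuplicated(df[[1]])                 # unique sample names
  outcome_ok  <- all(df[[2]] %in% c(0, 1))               # outcomes coded 0/1
  features_ok <- all(vapply(df[-(1:2)], is.numeric, logical(1)))  # assumed rule
  ids_ok && outcome_ok && features_ok
}

layout_ok <- check_dia_layout(toy_dia)

# With E2E installed and set up, a call might look like this (not run here):
# results <- models_dia(data = toy_dia, model = c("rf", "xb"),
#                       tune = FALSE, seed = 123)
```

Checking the layout before training catches the most common mistake, an outcome column that is not coded 0/1, before any model is fitted.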
Bootstrap aggregating ensemble method. Parameters:
data (required): Training data frame
base_model_name (required): Base model name (e.g., “xb”, “rf”)
n_estimators: Integer (default 50). Number of base models
subset_fraction: Numeric (default 0.632). Bootstrap sampling fraction
tune_base_model: Logical (default FALSE). Tune base models
threshold_choices: Same as models_dia()
seed: Integer (default 123). Random seed
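To make n_estimators and subset_fraction concrete, here is a minimal base-R sketch of how per-model training subsets could be drawn. It shows the general bagging idea only. Whether E2E samples with or without replacement is not stated in this guide, so sampling without replacement is an assumption here. For reference, 0.632 is approximately 1 - 1/e, the expected share of distinct samples in a classical bootstrap draw.

```r
set.seed(123)
n_samples       <- 100    # hypothetical training set size
n_estimators    <- 50     # documented default
subset_fraction <- 0.632  # documented default

subset_size <- round(subset_fraction * n_samples)

# One index vector per base model. Sampling without replacement is an
# assumption of this sketch; the package may sample differently.
subsets <- lapply(seq_len(n_estimators), function(i) {
  sample.int(n_samples, size = subset_size, replace = FALSE)
})

length(subsets)       # number of base models
length(subsets[[1]])  # rows seen by each base model
```

Each base model would then be trained on its own subset, and the final prediction aggregated across all of them.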
Voting ensemble combining multiple models. Parameters:
results_all_models (required): Output from models_dia()
data (required): Training data
type: Voting type
weight_metric: String (default “AUROC”). Metric for soft voting weights
top: Integer (default 5). Number of top models to use
threshold_choices: Same as models_dia()
seed: Integer (default 123). Random seed
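The interplay of top and weight_metric in soft voting can be sketched as follows. The probabilities and AUROC values are made up, and normalizing the metric values into weights is one plausible scheme that this sketch assumes; the package may weight models differently.

```r
# Hypothetical predicted probabilities from three models for four samples.
probs <- cbind(
  rf    = c(0.90, 0.20, 0.60, 0.40),
  xb    = c(0.80, 0.30, 0.70, 0.20),
  lasso = c(0.60, 0.40, 0.40, 0.50)
)
# Hypothetical AUROC of each model (the weight metric).
auroc <- c(rf = 0.85, xb = 0.80, lasso = 0.70)

top  <- 2  # keep the two best models by the weight metric
keep <- names(sort(auroc, decreasing = TRUE))[seq_len(top)]

# Normalize the metric values so the weights sum to 1 (assumed scheme).
weights <- auroc[keep] / sum(auroc[keep])

# Soft vote: weighted average of the kept models' probabilities.
soft_score <- as.vector(probs[, keep, drop = FALSE] %*% weights)
```

Models outside the top set contribute nothing, so a small top value trades diversity for a stronger average member.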
Stacking ensemble with meta-model. Parameters:
results_all_models (required): Output from models_dia()
data (required): Training data
meta_model_name (required): Meta-model name (e.g., “lasso”, “gbm”)
top: Integer (default 5). Number of top base models
tune_meta: Logical (default FALSE). Tune meta-model
threshold_choices: Same as models_dia()
seed: Integer (default 123). Random seed
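As a rough picture of what stacking does, the sketch below feeds the scores of two simulated base models into a meta-model. Everything in it is illustrative: the base-model scores are simulated, and `glm()` stands in for whatever meta-model is selected through meta_model_name.

```r
set.seed(123)
n <- 200
y <- rbinom(n, 1, 0.5)

# Hypothetical base-model scores: noisy signals correlated with the outcome.
base_preds <- data.frame(
  model_a = plogis(2 * y - 1 + rnorm(n)),
  model_b = plogis(1.5 * y - 0.75 + rnorm(n, sd = 1.5))
)

# Meta-model: plain logistic regression on the base-model scores.
# glm() is a stand-in here, not the meta-model E2E would fit.
meta_fit <- glm(y ~ model_a + model_b,
                data = cbind(y = y, base_preds),
                family = binomial())

# Final stacked score for each sample.
stacked_score <- predict(meta_fit, type = "response")
```

In practice the meta-model is usually trained on out-of-fold base predictions to limit overfitting; this sketch skips that step for brevity.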
Handles imbalanced datasets using an EasyEnsemble-like algorithm. Parameters:
data (required): Imbalanced training data
base_model_name (required): Base model for balanced subsets
n_estimators: Integer (default 10). Number of balanced subsets
tune_base_model: Logical (default FALSE). Tune base models
threshold_choices: Same as models_dia()
seed: Integer (default 123). Random seed
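The balanced subsets behind an EasyEnsemble-style method can be sketched in a few lines of base R. This follows the general EasyEnsemble idea (all minority samples plus an equally sized random draw from the majority class) and is an assumption about the approach, not a description of E2E's internals.

```r
set.seed(123)
# Hypothetical imbalanced outcome: 90 negatives, 10 positives.
y <- c(rep(0, 90), rep(1, 10))
n_estimators <- 10  # documented default

minority_idx <- which(y == 1)
majority_idx <- which(y == 0)

# Each subset keeps every minority sample and adds an equally sized
# random draw (without replacement) from the majority class.
balanced_subsets <- lapply(seq_len(n_estimators), function(i) {
  c(minority_idx, sample(majority_idx, length(minority_idx)))
})

table(y[balanced_subsets[[1]]])  # class counts in the first subset
```

One base model is trained per subset, so across the ensemble most majority samples are used at least once even though each individual model sees a balanced set.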
apply_dia(): Applies a trained model to new data. Parameters:
trained_model_object (required): Trained model object from E2E functions
new_data (required): New data for prediction (sample IDs in column 1)
label_col_name: String (default NULL). Name of the true label column, if available
Evaluates model predictions. Parameters:
prediction_df (required): Prediction data frame from apply_dia()
threshold_choices: Same as models_dia()
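To show what a threshold choice changes at evaluation time, the sketch below scores a small made-up prediction table at a fixed 0.5 cutoff and at a cutoff picked by Youden's J. The column names and the `sens_spec()` helper are hypothetical, and Youden's J is used only as a familiar example of a threshold rule, not as a statement of which methods threshold_choices accepts.

```r
# Hypothetical prediction table; column names are assumptions.
pred <- data.frame(
  sample = paste0("S", 1:8),
  label  = c(1, 1, 1, 1, 0, 0, 0, 0),
  score  = c(0.90, 0.80, 0.45, 0.30, 0.60, 0.40, 0.20, 0.10)
)

# Hypothetical helper: sensitivity and specificity at a given cutoff.
sens_spec <- function(label, score, cutoff) {
  called <- as.integer(score >= cutoff)
  c(sensitivity = sum(called == 1 & label == 1) / sum(label == 1),
    specificity = sum(called == 0 & label == 0) / sum(label == 0))
}

# Metrics at a fixed cutoff of 0.5.
fixed <- sens_spec(pred$label, pred$score, 0.5)

# Youden's J: pick the observed score that maximizes sens + spec - 1
# (the first maximum wins when several cutoffs tie).
cutoffs <- sort(unique(pred$score))
j <- vapply(cutoffs, function(ct) {
  sum(sens_spec(pred$label, pred$score, ct)) - 1
}, numeric(1))
youden_cutoff <- cutoffs[which.max(j)]
```

The same scores yield different sensitivity and specificity depending on the cutoff, which is why the threshold method should match between training and evaluation.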
models_pro(): Trains base survival models. Parameters:
data (required): Data frame with sample ID, survival status, time, and features
model (required): Model names, or “all_pro” for all models
tune: Logical (default FALSE). Whether to perform hyperparameter tuning
time_unit: String (default “day”). Time unit (“day”, “month”, “year”)
years_to_evaluate: Numeric vector (default c(1,3,5)). Time points, in years, for time-dependent AUROC
seed: Integer (default 789). Random seed
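The survival data layout and the relationship between time_unit and years_to_evaluate can be sketched as below. The column names, the event coding (1 = event, 0 = censored), the `years_to_units()` helper, and the conversion factors are all assumptions of this sketch; the package may convert time differently.

```r
# Toy survival data in the documented column order:
# sample ID, status, time, then features. Names are illustrative.
set.seed(789)
n <- 30
toy_pro <- data.frame(
  sample = paste0("P", seq_len(n)),
  status = rbinom(n, 1, 0.6),            # 1 = event, 0 = censored (assumed coding)
  time   = round(rexp(n, 1 / 900)) + 1,  # follow-up in days
  feat_a = rnorm(n)
)

# Hypothetical helper: express evaluation years in the data's time unit.
# The conversion factors are this sketch's assumption.
years_to_units <- function(years, time_unit = c("day", "month", "year")) {
  time_unit <- match.arg(time_unit)
  per_year <- c(day = 365.25, month = 12, year = 1)[[time_unit]]
  years * per_year
}

# Default evaluation points (1, 3, and 5 years) expressed in days.
eval_times <- years_to_units(c(1, 3, 5), "day")
```

If time_unit does not match the unit actually stored in the time column, the evaluation points land at the wrong follow-up times, so it is worth checking this before training.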
Stacking ensemble for survival analysis. Parameters:
results_all_models (required): Output from models_pro()
data (required): Training data
meta_model_name (required): Meta-model name
top: Integer (default 3). Number of top base models
tune_meta: Logical (default FALSE). Tune meta-model
time_unit: String (default “day”). Time unit
years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
seed: Integer (default 789). Random seed
Bootstrap aggregating for survival analysis. Parameters:
data (required): Training data
base_model_name (required): Base model name
n_estimators: Integer (default 10). Number of base models
subset_fraction: Numeric (default 0.632). Bootstrap sampling fraction
tune_base_model: Logical (default FALSE). Tune base models
time_unit: String (default “day”). Time unit
years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
seed: Integer (default 456). Random seed
apply_pro(): Applies a trained survival model to new data. Parameters:
trained_model_object (required): Trained model object
new_data (required): New data with the same structure as the training data
time_unit: String (default “day”). Time unit
Evaluates survival model predictions. Parameters:
prediction_df (required): Prediction data frame from apply_pro()
years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
Creates diagnostic model evaluation plots. Parameters:
type (required): Plot type
data (required): Model results object
Creates prognostic model evaluation plots. Parameters:
type (required): Plot type
data (required): Model results object
time_unit: String (default “days”). Time unit for axis labels
Creates SHAP interpretation plots. Parameters:
data (required): Model results object containing the sample_score data frame
raw_data (required): Original feature data
target_type (required): Data type
Registers custom algorithms.
Usage:
1. Define a custom function following E2E conventions
2. Register it with register_model_dia("model_name", custom_function)
3. Use the registered model in E2E workflows
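The define, register, use pattern in the steps above can be illustrated with a generic name-to-function registry in base R. This is not E2E's implementation: `register_model()`, the `registry` environment, and the `(x, y)` signature of the custom function are all hypothetical, since the actual E2E conventions for custom functions are not spelled out in this guide.

```r
# A generic name-to-function registry; NOT E2E's implementation.
registry <- new.env()

# Hypothetical stand-in for a registration function.
register_model <- function(name, fn) {
  stopifnot(is.character(name), length(name) == 1, is.function(fn))
  assign(name, fn, envir = registry)
  invisible(name)
}

# Hypothetical custom model: logistic regression returning a predictor.
# The (x, y) signature is an assumption, not the E2E convention.
my_glm <- function(x, y) {
  fit <- glm(y ~ ., data = cbind(y = y, x), family = binomial())
  function(new_x) predict(fit, newdata = new_x, type = "response")
}

# Step 2: register the function under a name.
register_model("my_glm", my_glm)

# Step 3: look the model up by name, train it, and predict.
set.seed(123)
x <- data.frame(a = rnorm(50), b = rnorm(50))
y <- rbinom(50, 1, plogis(x$a))
predict_fn <- get("my_glm", envir = registry)(x, y)
scores <- predict_fn(x)
```

Looking models up by name is what lets a registered algorithm be requested with the same string-based interface as the built-in ones.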