This function automatically selects and applies the most suitable imputation method for the given dataset. If Quality Control (QC) samples are present, the method that best stabilizes them (i.e., yields the lowest median coefficient of variation) is chosen. Otherwise, it defaults to a sample-size-based strategy:

  • fewer than 30 samples: Sample minimum imputation

  • between 30 and 100 samples: Minimum probability imputation

  • more than 100 samples: MissForest imputation
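The fallback rule above can be sketched as a small lookup. This is only an illustration, not the package's implementation: `default_method_for()` is a hypothetical helper, and the handling of exactly 30 samples is an assumption taken from the "<= 30" message shown in the Examples output.

```r
# Hypothetical helper: maps a sample count to the fallback method listed above.
# Assumption: exactly 30 samples falls in the first tier (per the "<= 30"
# message in the Examples output) and exactly 100 falls in the second tier.
default_method_for <- function(n_samples) {
  stopifnot(is.numeric(n_samples), length(n_samples) == 1, n_samples >= 0)
  if (n_samples <= 30) {
    "sample minimum"
  } else if (n_samples <= 100) {
    "minimum probability"
  } else {
    "MissForest"
  }
}

default_method_for(12)
default_method_for(60)
default_method_for(250)
```

The helper returns descriptive labels rather than function objects so that it does not depend on any particular imputation function being available.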

Usage

auto_impute(
  exp,
  group_col = "group",
  qc_name = "QC",
  to_try = NULL,
  info = NULL
)

Arguments

exp

A glyexp::experiment().

group_col

The column name in sample_info for groups. Default is "group". Can be NULL when no group information is available.

qc_name

The name of QC samples in the group_col column. Default is "QC". Only used when group_col is not NULL.

to_try

A list of imputation functions to try. If NULL (the default), all imputation methods are included (see Details).

info

Internal parameter used by auto_clean().

Value

The imputed experiment.

Details

By default, all imputation methods are included for benchmarking when QC samples are available. Note that some methods (e.g., MissForest) may be slow for large datasets.
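The QC-based selection can be illustrated in base R. The sketch below is a simplified stand-in for the benchmarking step, not the package's code: the two candidate functions, the toy matrix, and the helper names `cv()` and `median_qc_cv()` are all hypothetical, and imputation is applied to the QC matrix alone for brevity.

```r
# Coefficient of variation of one variable across QC samples.
cv <- function(x) stats::sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Median CV over all variables (rows) of a variables-by-QC-samples matrix.
median_qc_cv <- function(qc_mat) {
  stats::median(apply(qc_mat, 1, cv), na.rm = TRUE)
}

# Hypothetical candidates: each takes a matrix with NAs and returns it filled.
candidates <- list(
  zero = function(m) {
    m[is.na(m)] <- 0
    m
  },
  sample_min = function(m) {
    # Fill each sample (column) with that sample's minimum observed value.
    for (j in seq_len(ncol(m))) {
      m[is.na(m[, j]), j] <- min(m[, j], na.rm = TRUE)
    }
    m
  }
)

# Toy QC matrix: 3 variables x 4 QC samples, one missing value per variable.
qc <- matrix(
  c(10, 11, NA, 10,
    50, NA, 52, 51,
     5,  5,  6, NA),
  nrow = 3, byrow = TRUE
)

# Score every candidate and keep the one with the lowest median CV.
scores <- vapply(candidates, function(f) median_qc_cv(f(qc)), numeric(1))
best <- names(scores)[which.min(scores)]
best
```

Each additional candidate adds one full imputation run to the benchmark, which is why slow methods dominate the total runtime on large datasets.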

Examples

library(glyexp)
exp_imputed <- auto_impute(real_experiment)
#> No QC samples found. Using default imputation method based on sample size.
#> Sample size <= 30, using `impute_sample_min()`.