
Calculate Derived Traits from Tidy Data
derive_traits_.Rd
This function calculates derived traits from a tibble in tidy format.
Use this function if you are not using the glyexp package.
For glycomics data, it calculates the derived traits directly.
For glycoproteomics data, each glycosite is treated as a separate glycome,
and derived traits are calculated in a site-specific manner.
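As a conceptual illustration of "each glycosite is treated as a separate glycome", the sketch below splits a small glycoproteomics tibble by glycosite with dplyr. It is not the package's implementation. The protein IDs, site positions, values, and the `glycan_structure` labels are placeholders made up for the illustration.

```r
library(dplyr)
library(tibble)

# Toy glycoproteomics table; all IDs and values are made-up placeholders.
gp_tbl <- tibble(
  sample = rep(c("S1", "S2"), each = 3),
  protein = rep(c("P1", "P1", "P2"), times = 2),
  protein_site = rep(c(10L, 25L, 10L), times = 2),
  glycan_structure = "placeholder_structure", # not a real structure string
  value = c(1, 2, 3, 4, 5, 6)
)

# Each unique (protein, protein_site) pair defines one glycosite,
# so splitting on that pair yields one "glycome" per site.
site_glycomes <- gp_tbl %>%
  group_by(protein, protein_site) %>%
  group_split()

n_sites <- length(site_glycomes)
```

Here `n_sites` counts the distinct glycosites, and each element of `site_glycomes` holds the rows of one site across all samples.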
Arguments
- tbl
A tibble in tidy format, with the following columns:
sample
: sample ID
glycan_structure
: glycan structures, either a glyrepr::glycan_structure() vector or a character vector of glycan structure strings supported by glyparse::auto_parse().
value
: the quantification of the glycan in the sample.
For glycoproteomics data, additional columns are needed:
protein
: protein ID
protein_site
: the glycosite position on the protein
The unique combination of protein and protein_site determines a glycosite.
Other columns are ignored.
Make sure that the data has been properly preprocessed, including normalization and missing value handling. For glycoproteomics data specifically, the data must be aggregated to the "glycoforms with structures" level, that is, one quantification value for each glycan structure on each glycosite in each sample.
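The column requirements above can be checked before calling derive_traits_(). The sketch below uses a hypothetical helper, `check_tidy_input()`, which is not part of any package. The protein ID, site, and values are made up, and the two IUPAC-condensed strings are only assumed to be parseable by glyparse::auto_parse().

```r
library(tibble)

# Hypothetical helper (not part of any package): verifies the required
# columns and that each glycan structure has a single value per sample
# (and per glycosite for glycoproteomics data).
check_tidy_input <- function(tbl, data_type = c("glycomics", "glycoproteomics")) {
  data_type <- match.arg(data_type)
  id_cols <- c("sample", "glycan_structure")
  if (data_type == "glycoproteomics") {
    id_cols <- c(id_cols, "protein", "protein_site")
  }
  required <- c(id_cols, "value")
  missing_cols <- setdiff(required, names(tbl))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(tbl[, id_cols]) > 0) {
    stop("Duplicated rows found; aggregate to one value per glycan structure.")
  }
  invisible(TRUE)
}

# Made-up example data; structure strings are assumed to be valid input.
gp_tbl <- tibble(
  sample = c("S1", "S1", "S2", "S2"),
  protein = "P12345",
  protein_site = 24L,
  glycan_structure = rep(c(
    "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-",
    "GlcNAc(b1-2)Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-"
  ), times = 2),
  value = c(10, 5, 8, 7)
)

ok <- check_tidy_input(gp_tbl, "glycoproteomics")
```

The duplicate check is one way to catch data that has not yet been aggregated to the "glycoforms with structures" level, for example when several peptide-level rows map to the same glycan structure on the same glycosite.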
- data_type
Either "glycomics" or "glycoproteomics".
- trait_fns
A named list of derived trait functions created by trait factories. Names of the list are the names of the derived traits. Default is NULL, which means all derived traits in basic_traits() are calculated.
- mp_fns
A named list of meta-property functions. This parameter is useful if your trait functions use custom meta-properties other than those in all_mp_fns(). Default is NULL, which means all meta-properties in all_mp_fns() are used.
Value
A tidy tibble containing the following columns:
sample
: sample ID
trait
: derived trait name
value
: the value of the derived trait
For glycoproteomics data, the result also contains these columns:
protein
: protein ID
protein_site
: the glycosite position on the protein
Other columns in the original tibble are not included.
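Because the result is in long format, it may be convenient to reshape it into one row per sample and one column per trait. The sketch below does this with tidyr::pivot_wider() on a mock tibble shaped like the glycomics return value. The "TM" values are copied from the Examples output, while "demo_trait" and its values are placeholders invented for the illustration.

```r
library(tibble)
library(tidyr)

# Mock result with the documented columns: sample, trait, value.
# "TM" rows reuse values from the Examples output;
# "demo_trait" is a placeholder trait with made-up values.
traits <- tibble(
  sample = rep(c("S1", "S2", "S3"), times = 2),
  trait = rep(c("TM", "demo_trait"), each = 3),
  value = c(0.0322, 0.0274, 0.0215, 0.5, 0.6, 0.7)
)

# One row per sample, one column per derived trait.
wide <- pivot_wider(traits, names_from = trait, values_from = value)
```

For glycoproteomics results, `protein` and `protein_site` would remain as identifier columns alongside `sample`, since pivot_wider() keeps all columns not named in `names_from` or `values_from`.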
Examples
# Create example tidy data
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following object is masked from ‘package:glyexp’:
#>
#> select_var
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
library(glyexp)
library(tibble)
tidy_data <- as_tibble(real_experiment2)
# Calculate traits
traits <- derive_traits_(tidy_data, data_type = "glycomics")
traits
#> # A tibble: 2,016 × 3
#> trait sample value
#> <chr> <chr> <dbl>
#> 1 TM S1 0.0322
#> 2 TM S2 0.0274
#> 3 TM S3 0.0215
#> 4 TM S4 0.0178
#> 5 TM S5 0.0238
#> 6 TM S6 0.0254
#> 7 TM S7 0.0234
#> 8 TM S8 0.0200
#> 9 TM S9 0.0170
#> 10 TM S10 0.0207
#> # ℹ 2,006 more rows