
Quantify motifs in an experiment
quantify_motifs.Rd
This function quantifies motifs from glycomic or glycoproteomic profiles. For glycomics data, it calculates the motif quantifications directly. For glycoproteomics data, each glycosite is treated as a separate glycome, and motif quantifications are calculated in a site-specific manner.
The function takes a glyexp::experiment() and returns a new glyexp::experiment() with motif quantifications. Instead of containing the quantification of each glycan on each glycosite in each sample, the new experiment contains the quantification of each motif on each glycosite in each sample (for glycoproteomics data) or of each motif in each sample (for glycomics data).
Because glyexp::experiment() provides a unified data structure, the returned motif experiment can be passed to downstream glycoverse functions such as glystats::gly_ttest() for further statistical analysis. You can also use as_tibble() to convert it to a "tidy" tibble for custom analysis.
Arguments
- exp
A glyexp::experiment() object. Before using this function, you should preprocess the data using the glyclean package. For glycoproteomics data, the data should be aggregated to the "gfs" (glycoforms with structures) level using glyclean::aggregate(). Also, please make sure that the glycan_structure column is present in the var_info table, as not all glycoproteomics identification software provides this information. The column can be a glyrepr::glycan_structure() vector, or a character vector of glycan structure strings supported by glyparse::auto_parse().
For glycoproteomics data, the var_info table must contain:
  - protein: protein ID
  - protein_site: the glycosite position on the protein
The unique combination of protein and protein_site determines a glycosite.
- motifs
A character vector of motif names, glycan structure strings, or a 'glyrepr_structure' object. For glycan structure strings, all formats supported by glyparse::auto_parse() are accepted, including IUPAC-condensed, WURCS, GlycoCT, and others.
- alignments
A character vector specifying alignment types for each motif. Can be a single value (applied to all motifs) or a vector of the same length as motifs.
- ignore_linkages
A logical value. If TRUE, linkages will be ignored in the comparison.
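As a rough illustration of how the arguments above fit together, here is a minimal sketch of a call. It makes several assumptions not stated on this page: that the function is exported from the glymotif package, that "substructure" is an accepted alignment type, and that the motif string shown is a valid IUPAC-condensed string for your glyparse version. The wrapper name quantify_example is hypothetical, and exp is assumed to be an already preprocessed glyexp::experiment().

```r
# Sketch only. Assumptions: the glymotif package exports quantify_motifs(),
# "substructure" is a valid alignment type, and the motif string below is an
# illustrative IUPAC-condensed structure. `quantify_example` is a hypothetical
# helper name.
quantify_example <- function(exp) {
  motif_exp <- glymotif::quantify_motifs(
    exp,
    motifs = c("Gal(b1-4)GlcNAc(b1-"),  # one illustrative motif
    alignments = "substructure",        # single value, applied to all motifs
    ignore_linkages = FALSE             # compare linkages as well
  )
  # Convert the returned experiment to a tidy tibble for custom analysis.
  tibble::as_tibble(motif_exp)
}
```

Passing a single alignments value keeps the call short; supply a vector of the same length as motifs when each motif needs its own alignment type.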
Value
A new glyexp::experiment() object for motif quantifications.
The new experiment contains the following columns in the var_info table:
- variable: variable ID
- motif: motif name
For glycoproteomics data, it also contains these additional columns:
- protein: protein ID
- protein_site: the glycosite position on the protein
Other columns in the var_info table (e.g. gene) are retained if they have a "many-to-one" relationship with glycosites (unique combinations of protein and protein_site). That is, each glycosite cannot have multiple values for these columns. gene is a common example, as a glycosite can only be associated with one gene. Glycan descriptions are not such a column, as a glycosite can have multiple glycans and thus multiple descriptions. Columns that do not have this relationship with glycosites are dropped. Don't worry if this logic is unclear: the function simply tries its best to preserve useful information.
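The "many-to-one" rule can be illustrated with a small base R sketch. This is not the package's implementation, only a demonstration of the check on a hypothetical var_info table; the column glycan_desc and the helper is_many_to_one are made up for the example.

```r
# Hypothetical var_info: three glycoforms on two glycosites.
var_info <- data.frame(
  protein      = c("P1", "P1", "P2"),
  protein_site = c(10L, 10L, 25L),
  gene         = c("G1", "G1", "G2"),
  glycan_desc  = c("high-mannose", "complex", "complex"),
  stringsAsFactors = FALSE
)

# A glycosite is the unique combination of protein and protein_site.
site <- paste(var_info$protein, var_info$protein_site, sep = "@")

# A column is "many-to-one" with glycosites if every site has exactly one value.
is_many_to_one <- function(column, site) {
  all(tapply(column, site, function(x) length(unique(x)) == 1L))
}

# Check every column other than the glycosite key columns.
candidates <- setdiff(names(var_info), c("protein", "protein_site"))
keep <- candidates[vapply(
  candidates,
  function(nm) is_many_to_one(var_info[[nm]], site),
  logical(1)
)]
keep  # "gene"
```

Here gene survives because each glycosite maps to a single gene, while glycan_desc is dropped because site P1@10 carries two different descriptions.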
sample_info and meta_data are not modified, except that the exp_type field of meta_data is set to "traitomics" for glycomics data and "traitproteomics" for glycoproteomics data.