
This function quantifies motifs from glycomic or glycoproteomic profiles. For glycomics data, it calculates the motif quantifications directly. For glycoproteomics data, each glycosite is treated as a separate glycome, and motif quantifications are calculated in a site-specific manner.

The function takes a glyexp::experiment() and returns a new glyexp::experiment() with motif quantifications. Instead of containing the quantification of each glycan on each glycosite in each sample, the new experiment contains the quantification of each motif on each glycosite in each sample (for glycoproteomics data) or the motif quantification in each sample (for glycomics data).

Due to the unified data structure of glyexp::experiment(), the returned motif experiment can be passed to downstream glycoverse functions like glystats::gly_ttest() for further statistical analysis. Also, you can use as_tibble() to convert it to a "tidy" tibble for custom analysis.

Usage

quantify_motifs(exp, motifs, alignments = NULL, ignore_linkages = FALSE)

Arguments

exp

A glyexp::experiment() object. Before using this function, you should preprocess the data using the glyclean package. For glycoproteomics data, the data should be aggregated to the "gfs" (glycoforms with structures) level using glyclean::aggregate(). Also, make sure that the glycan_structure column is present in the var_info table, as not all glycoproteomics identification software provides this information. The column can be a glyrepr::glycan_structure() vector, or a character vector of glycan structure strings supported by glyparse::auto_parse().

For glycoproteomics data, the var_info table must contain:

  • protein: protein ID

  • protein_site: the glycosite position on the protein

The unique combination of protein and protein_site determines a glycosite.

motifs

A character vector of motif names, glycan structure strings, or a 'glyrepr_structure' object. For glycan structure strings, all formats supported by glyparse::auto_parse() are accepted, including IUPAC-condensed, WURCS, GlycoCT, and others.

alignments

A character vector specifying alignment types for each motif. Can be a single value (applied to all motifs) or a vector of the same length as motifs.
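The length rule above can be sketched in base R. This is only an illustration of how a single value is recycled across motifs, not the package's actual implementation: resolve_alignments() is a hypothetical helper, and the motif strings and alignment values ("substructure", "terminal") are assumed examples rather than values confirmed by this page.

```r
# Hypothetical helper: pair each motif with an alignment type.
# A single alignment is recycled; otherwise the lengths must match.
resolve_alignments <- function(motifs, alignments) {
  if (length(alignments) == 1L) {
    alignments <- rep(alignments, length(motifs))
  }
  if (length(alignments) != length(motifs)) {
    stop("`alignments` must have length 1 or the same length as `motifs`.")
  }
  stats::setNames(alignments, motifs)
}

# Assumed example motifs (IUPAC-condensed-style strings).
motifs <- c("Gal(b1-4)GlcNAc(b1-", "Neu5Ac(a2-3)Gal(b1-")

# One value applied to all motifs.
resolve_alignments(motifs, "substructure")

# One value per motif.
resolve_alignments(motifs, c("substructure", "terminal"))
```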

ignore_linkages

A logical value. If TRUE, linkages will be ignored in the comparison.

Value

A new glyexp::experiment() object for motif quantifications. The new experiment contains the following columns in the var_info table:

  • variable: variable ID

  • motif: motif name

For glycoproteomics data, with additional columns:

  • protein: protein ID

  • protein_site: the glycosite position on the protein

Other columns in the var_info table (e.g. gene) are retained if they have a "many-to-one" relationship with glycosites (unique combinations of protein and protein_site). That is, each glycosite must have exactly one value in the column. gene is a common example, as a glycosite can only be associated with one gene. A glycan description column does not qualify, because a glycosite can carry multiple glycans and therefore multiple descriptions. Columns without this relationship are dropped. In short, the function keeps every column that is still unambiguous at the glycosite level.
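The "many-to-one" rule can be illustrated with a minimal base R sketch. It works on a plain data frame, not a real glyexp::experiment(), and the table contents, the glycan_desc column, and the is_many_to_one() helper are all hypothetical; the package may implement the check differently.

```r
# Hypothetical var_info table: one row per glycoform.
var_info <- data.frame(
  protein      = c("P1", "P1", "P1", "P2"),
  protein_site = c(10L, 10L, 25L, 7L),
  gene         = c("G1", "G1", "G1", "G2"),
  glycan_desc  = c("high-mannose", "complex", "complex", "hybrid"),
  stringsAsFactors = FALSE
)

# A glycosite is a unique combination of protein and protein_site.
site_keys <- c("protein", "protein_site")
site_id <- paste(var_info$protein, var_info$protein_site, sep = "@")

# A column is "many-to-one" if every glycosite maps to exactly one value.
is_many_to_one <- function(column, groups) {
  n_values <- tapply(column, groups, function(x) length(unique(x)))
  all(n_values == 1L)
}

# Check every non-key column and keep only the unambiguous ones.
other_cols <- setdiff(names(var_info), site_keys)
keep <- other_cols[vapply(
  other_cols,
  function(col) is_many_to_one(var_info[[col]], site_id),
  logical(1)
)]

keep
```

Here gene is kept because each glycosite has a single gene, while glycan_desc is dropped because site 10 of P1 carries two different descriptions.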

sample_info and meta_data are not modified, except that the exp_type field of meta_data is set to "traitomics" for glycomics data, and "traitproteomics" for glycoproteomics data.

See also