Extract key properties of N-glycans, including:

  • "glycan_type": N-glycan type: high mannose, hybrid, complex, or paucimannose.

  • "bisecting": Bisecting GlcNAc presence.

  • "antennae": Number of antennae.

  • "core_fuc": Number of core fucoses.

  • "arm_fuc": Number of arm fucoses.

  • "terminal_gal": Number of terminal galactoses.

Usage

describe_n_glycans(glycans, strict = FALSE, parallel = FALSE)

Arguments

glycans

A glyrepr_structure object, or a character vector of IUPAC-condensed structure strings.

strict

A logical value. If TRUE, the glycan must have "concrete" monosaccharides (e.g. "GlcNAc", "Man", "Gal") and linkage information. If FALSE, the function is more lenient: it checks monosaccharide identities at the "generic" level (e.g. "Hex", "HexNAc") and ignores linkage information. Default is FALSE. This is preferred because in most cases the structural resolution is low, yet we know for certain that the glycans are N-glycans.

parallel

A logical value. If TRUE, the function will use parallel processing. Remember to call future::plan() before using this argument, otherwise the function will still use sequential processing.

Value

A tibble with the following columns: "glycan_type", "bisecting", "antennae", "core_fuc", "arm_fuc", "terminal_gal". If the input glycans have names, the names are used as the "glycan" column. Otherwise, the IUPAC-condensed strings are used as the "glycan" column.

Details

This function is designed to work with N-glycans only. If the glycans are not N-glycans, an error is thrown.
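Because non-N-glycan input raises an error, a batch run may benefit from catching it. The following is a minimal sketch, not part of the package: safe_describe() is a hypothetical helper name, and it assumes the package providing describe_n_glycans() is already attached.

```r
# Hypothetical wrapper (an illustration, not a package function).
# Returns NULL instead of stopping when describe_n_glycans() throws,
# e.g. because the input contains glycans that are not N-glycans.
safe_describe <- function(glycans, ...) {
  tryCatch(
    describe_n_glycans(glycans, ...),
    error = function(e) {
      # Report the original error message, then signal failure with NULL.
      message("Could not describe glycans: ", conditionMessage(e))
      NULL
    }
  )
}
```

A caller can then test the result with is.null() and decide whether to skip or inspect the offending input.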

Strictness

By default (strict = FALSE), the function is very lenient in motif checking. It only checks the monosaccharide types at the "generic" level (e.g. "Hex", "HexNAc") and ignores linkage information. This is preferred because the structural resolution is often low, e.g. in most glycoproteomics studies, while the glycosylation sites still guarantee that the glycans are N-glycans. In this case, we can make some assumptions about the glycan structures and extract the key properties. For example, an H-N terminal motif is considered a terminal galactose. If you have high-resolution glycan structures, you can set strict = TRUE.

Enabling parallel processing

This function can take a long time on large datasets (e.g. > 500 glycans). To speed it up, you can enable parallel processing by setting parallel = TRUE. However, setting the argument to TRUE only makes the function "ready" for parallel processing. You still need to call future::plan() to change the parallel backend. For example, to use the "multisession" backend:

library(future)
old_plan <- future::plan("multisession")  # Save the old plan
describe_n_glycans(glycans, parallel = TRUE)
future::plan(old_plan)  # Restore the old plan
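If the call fails in the snippet above, the old plan is never restored. One way to guard against that is to wrap the call in a function and restore the plan with on.exit(). This is a sketch under assumptions: describe_n_glycans_parallel() and its workers argument are hypothetical names introduced here, and the package providing describe_n_glycans() is assumed to be attached.

```r
# Hypothetical helper (an illustration, not a package function).
# Runs describe_n_glycans() under a temporary "multisession" plan and
# restores the previous plan on exit, even if the call throws an error.
describe_n_glycans_parallel <- function(glycans, workers = 2, ...) {
  # future::plan() returns the previous plan when a new one is set.
  old_plan <- future::plan(future::multisession, workers = workers)
  on.exit(future::plan(old_plan), add = TRUE)
  describe_n_glycans(glycans, parallel = TRUE, ...)
}
```

The number of workers is a tuning choice; the default of 2 here is arbitrary.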

Examples

library(purrr)

glycans <- c(
  "Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-",
  "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-"
)
describe_n_glycans(glycans)
#> # A tibble: 2 × 7
#>   glycan_type  bisecting n_antennae n_core_fuc n_arm_fuc n_gal n_terminal_gal
#>   <chr>        <lgl>          <int>      <int>     <int> <int>          <int>
#> 1 paucimannose FALSE             NA          0         0     0              0
#> 2 complex      FALSE              1          0         0     0              0