
Describe N-Glycans Properties
describe_n_glycans.Rd
Extract key properties of N-glycans, including:
"glycan_type": N-glycan type: high mannose, hybrid, complex, or paucimannose.
"bisecting": Bisecting GlcNAc presence.
"antennae": Number of antennae.
"core_fuc": Number of core fucoses.
"arm_fuc": Number of arm fucoses.
"terminal_gal": Number of terminal galactoses.
Arguments
- glycans
A glyrepr_structure object, or a character vector of IUPAC-condensed structure strings.
- strict
A logical value. If TRUE, the glycan must have "concrete" monosaccharides (e.g. "GlcNAc", "Man", "Gal") and linkage information. If FALSE, the function is more lenient, checking monosaccharide identities at the "generic" level (e.g. "Hex", "HexNAc") and ignoring linkage information. Default is FALSE. This is preferred because in most cases the structural resolution is not high, but we know for sure that the glycans are N-glycans.
- parallel
A logical value. If TRUE, the function will use parallel processing. Remember to call future::plan() before using this argument; otherwise the function will still run sequentially.
Value
A tibble with the following columns: "glycan_type", "bisecting", "antennae", "core_fuc", "arm_fuc", "terminal_gal". If the input glycans have names, the names are used as the "glycan" column. Otherwise, the IUPAC-condensed strings are used as the "glycan" column.
Details
This function is designed to work with N-glycans only. If the glycans are not N-glycans, an error is thrown.
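Because non-N-glycan input raises an error, a batch job may want to catch it rather than stop. The following is a minimal sketch, assuming the error is signalled as a standard R error condition. The helper name try_describe and its describe_fn argument are hypothetical and not part of the package; describe_fn is meant to be describe_n_glycans, and a stand-in is used here so the sketch runs on its own.

```r
# Hypothetical helper (not part of the package): returns NULL with a warning
# instead of stopping when `describe_fn` signals an error.
# `describe_fn` is expected to be describe_n_glycans; it is passed as an
# argument so the sketch can be run with a stand-in function.
try_describe <- function(glycans, describe_fn, ...) {
  tryCatch(
    describe_fn(glycans, ...),
    error = function(e) {
      warning("Could not describe glycans: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Stand-in that always fails, mimicking the documented error for non-N-glycans.
always_fails <- function(glycans, ...) stop("not N-glycans")

# The error is converted to a warning (suppressed here) and NULL is returned.
result <- suppressWarnings(try_describe("Gal(b1-3)GalNAc(a1-", always_fails))
is.null(result)
```

In real use, the call would look like try_describe(glycans, describe_n_glycans), with any further arguments such as strict passed through the dots.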
Strictness
By default (strict = FALSE), the function is very lenient in motif checking.
It only checks monosaccharide types at the "generic" level (e.g. "Hex", "HexNAc")
and ignores linkage information.
This is preferred because in most cases, e.g. in most glycoproteomics studies,
the structural resolution is not high.
However, the glycosylation sites guarantee that the glycans are N-glycans,
so we can make some assumptions about the glycan structures
and extract the key properties.
For example, an H-N terminal motif is considered a terminal galactose.
If you have high-resolution glycan structures, you can set strict = TRUE.
Enabling parallel processing
This function can take a long time on large datasets (e.g. > 500 glycans).
To speed it up, you can enable parallel processing by setting parallel = TRUE.
However, setting the argument to TRUE only makes the function "ready"
for parallel processing.
You still need to call future::plan() to change the parallel backend.
For example, to use the "multisession" backend:
library(future)
old_plan <- future::plan("multisession") # Save the old plan
describe_n_glycans(glycans, parallel = TRUE)
future::plan(old_plan) # Restore the old plan
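If the call fails between setting and restoring the plan, the old plan is never restored. One way to guard against that is to wrap the pattern in a function and restore the plan with on.exit(). This is only a sketch: the wrapper name with_parallel_plan and its describe_fn and strategy arguments are hypothetical, and it assumes the future package is installed.

```r
# Hypothetical wrapper (not part of the package): switches the future plan,
# runs `describe_fn` with parallel = TRUE, and always restores the old plan.
# `describe_fn` is expected to be describe_n_glycans.
with_parallel_plan <- function(glycans, describe_fn, strategy = "multisession") {
  old_plan <- future::plan(strategy)          # plan() returns the previous plan
  on.exit(future::plan(old_plan), add = TRUE) # restore even if describe_fn() errors
  describe_fn(glycans, parallel = TRUE)
}

# Intended usage (requires the package that provides describe_n_glycans):
# with_parallel_plan(glycans, describe_n_glycans)
```

The function is passed in as an argument only so the wrapper does not depend on a specific package being attached.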
Examples
library(purrr)
glycans <- c(
"Man(a1-3)[Man(a1-3)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-",
"Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-"
)
describe_n_glycans(glycans)
#> # A tibble: 2 × 7
#>   glycan_type  bisecting n_antennae n_core_fuc n_arm_fuc n_gal n_terminal_gal
#>   <chr>        <lgl>          <int>      <int>     <int> <int>          <int>
#> 1 paucimannose FALSE             NA          0         0     0              0
#> 2 complex      FALSE              1          0         0     0              0