
Reconstruct the biosynthetic pathway for one or more glycans from enzymatic reactions. This function uses a multi-target breadth-first search (BFS) to find all feasible pathways that together synthesize every target glycan.

Usage

rebuild_biosynthesis(glycans, enzymes = NULL, max_steps = 20, filter = NULL)

Arguments

glycans

A glyrepr::glycan_structure() vector, or a character vector of strings supported by glyparse::auto_parse(). Can also be a single glycan. If multiple glycans are provided, the starting structure is determined by the first glycan, so do not mix N- and O-glycans in the same vector.

enzymes

A character vector of gene symbols, or a list of enzyme() objects. If NULL (default), all available enzymes will be used.

max_steps

Integer, maximum number of enzymatic steps to search. Default is 20.

filter

Optional function to filter generated glycans at each step. Should take a glyrepr::glycan_structure() vector as input and return a logical vector of the same length. It will be applied to all the generated glycans at each BFS step for pruning.
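As an illustration of the expected shape of a filter function, here is a minimal sketch. The helper name max_fuc_filter is hypothetical and not part of glyenzy. The sketch assumes that as.character() on a glyrepr::glycan_structure() vector yields IUPAC-condensed strings, and that TRUE marks a glycan to keep; neither is confirmed by this page.

```r
# Hypothetical helper (not part of glyenzy): build a filter that flags
# glycans carrying at most `max_fuc` fucose residues.
max_fuc_filter <- function(max_fuc = 2L) {
  function(glycans) {
    # Assumption: as.character() gives IUPAC-condensed strings for a
    # glyrepr::glycan_structure() vector (it is a no-op on character input).
    iupac <- as.character(glycans)
    # Count literal "Fuc(" occurrences in each string.
    n_fuc <- lengths(regmatches(iupac, gregexpr("Fuc(", iupac, fixed = TRUE)))
    # Logical vector of the same length as the input, as `filter` requires.
    n_fuc <= max_fuc
  }
}

keep <- max_fuc_filter(0L)(c(
  "Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
  "Gal(b1-3)GalNAc(a1-"
))
```

Under these assumptions it could be passed as rebuild_biosynthesis(glycans, filter = max_fuc_filter(1L)).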

Value

An igraph::igraph() object representing the synthesis path(s). Vertices represent glycan structures with name attribute containing IUPAC-condensed strings. Edges represent enzymatic reactions with enzyme attribute containing gene symbols and step attribute indicating the step number. For multiple targets, the graph includes all synthesis paths needed to reach every target glycan.
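The attributes described above can be read with ordinary igraph accessors. The sketch below does not call rebuild_biosynthesis(); instead it builds a small stand-in graph g with the same attribute layout from a few rows of the example edge table, and assumes the graph is directed.

```r
library(igraph)

# Stand-in for the return value of rebuild_biosynthesis(): the first four
# rows of the example edge table. Directedness is an assumption.
edges <- data.frame(
  from = c("GalNAc(a1-",
           "Gal(b1-3)GalNAc(a1-",
           "GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
           "GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-"),
  to = c("Gal(b1-3)GalNAc(a1-",
         "GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
         "Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
         "Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-"),
  enzyme = c("C1GALT1", "B3GNT3", "B4GALT1", "B4GALT2"),
  step = c(1L, 2L, 3L, 3L)
)
g <- graph_from_data_frame(edges, directed = TRUE)

glycan_names <- V(g)$name          # IUPAC-condensed strings on vertices
enzymes_used <- unique(E(g)$enzyme) # gene symbols on edges
n_steps <- max(E(g)$step)           # deepest step in the graph
```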

Details

For N-glycans, the starting structure is assumed to be "Glc(3)Man(9)GlcNAc(2)", the N-glycan precursor transferred to Asn by OST. For O-glycans, the starting structure is assumed to be "GalNAc(a1-".

Important notes

Here are some important notes for all functions in the glyenzy package.

Applicability

All algorithms and enzyme information in glyenzy are applicable only to humans, and specifically to N-glycans and O-GalNAc glycans. Results may be inaccurate for other types of glycans (e.g., GAGs, glycolipids) or for glycans in other species (e.g., plants, insects).

Inclusiveness

The algorithm takes an intentionally inclusive approach, assuming that all possible isoenzymes capable of catalyzing a given reaction may be involved. Therefore, results should be interpreted with caution.

For example, in humans, detection of the motif "Neu5Ac(a2-3)Gal(b1-" will return both "ST3GAL3" and "ST3GAL4". In reality, only one of them might be active, depending on factors such as tissue specificity.
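One consequence of this inclusive approach is that a single reaction can appear as several parallel edges, one per candidate isoenzyme. A minimal sketch of collapsing them into one row per reaction, using base R on an edge table shaped like the one in the Examples (rows 8 to 11 are copied here; the "/" separator is an arbitrary choice):

```r
# Edge table rows copied from the example output (step 4).
edges <- data.frame(
  from = rep("Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-", 4),
  to = c("Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
         rep("Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-", 3)),
  enzyme = c("FUT3", "ST3GAL3", "ST3GAL4", "ST3GAL6"),
  step = 4L
)

# One row per (from, to, step) reaction, with candidate isoenzymes joined.
reactions <- aggregate(
  enzyme ~ from + to + step,
  data = edges,
  FUN = function(x) paste(sort(unique(x)), collapse = "/")
)
```

Each row of reactions then lists every isoenzyme that could catalyze that step, which makes it easier to see where tissue-specific expression data would be needed to narrow the candidates.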

Only "concrete" glycans

The function only works for glycans containing concrete residues (e.g., "Glc", "GalNAc"), and not for glycans with generic residues (e.g., "Hex", "HexNAc").

Substituents

Substituents (e.g., sulfation, phosphorylation) are not supported yet, and the algorithms might fail for glycans with substituents. If your glycans contain substituents, use glyrepr::remove_substituents() to get clean glycans.

Incomplete glycan structures

If the glycan structure is incomplete or partially degraded, the result may be misleading.

Starting points

Throughout glyenzy, the starting glycan is the Glc(3)Man(9)GlcNAc(2) precursor for N-glycans, and GalNAc(a1- for O-glycans. This means that the enzymes involved in N-glycan precursor biosynthesis (mainly the ALGs) and OST, which transfers the precursor to Asn, are not considered here. Similarly, GALNTs for O-glycans are not considered.

Examples

library(glyenzy)
library(glyrepr)
library(glyparse)

# Rebuild the biosynthetic pathway of a single glycan
glycan <- "Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-"
path <- rebuild_biosynthesis(glycan, max_steps = 20)

# Rebuild pathways for multiple glycans
glycans <- c(
  "Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-",
  "Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-"
)
path <- rebuild_biosynthesis(glycans, max_steps = 20)

# View the path
igraph::as_data_frame(path, what = "edges")
#>                                                    from
#> 1                                            GalNAc(a1-
#> 2                                   Gal(b1-3)GalNAc(a1-
#> 3                       GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 4                       GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 5                       GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 6                       GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 7                       GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 8              Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 9              Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 10             Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 11             Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 12  Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 13  Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 14  Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 15 Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#> 16 Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-
#>                                                                 to  enzyme step
#> 1                                              Gal(b1-3)GalNAc(a1- C1GALT1    1
#> 2                                  GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-  B3GNT3    2
#> 3                         Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- B4GALT1    3
#> 4                         Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- B4GALT2    3
#> 5                         Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- B4GALT3    3
#> 6                         Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- B4GALT4    3
#> 7                         Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- B4GALT5    3
#> 8              Fuc(a1-3)[Gal(b1-4)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-    FUT3    4
#> 9             Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL3    4
#> 10            Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL4    4
#> 11            Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL6    4
#> 12 Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL3    5
#> 13 Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL4    5
#> 14 Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1- ST3GAL6    5
#> 15 Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-    FUT3    5
#> 16 Neu5Ac(a2-3)Gal(b1-4)[Fuc(a1-3)]GlcNAc(b1-3)Gal(b1-3)GalNAc(a1-    FUT7    5