Skip to contents

A vector of glycan structures with efficient storage using hash-based deduplication.

Usage

glycan_structure(...)

is_glycan_structure(x)

Arguments

...

igraph graph objects to be converted to glycan structures, or existing glycan structure vectors.

x

An object to check or convert.

Value

A glycan structure vector object.

Details

The underlying implementation uses hash values of IUPAC codes of the glycan structures. This prevents redundant storage and computation, which is very useful for glycan structures.

Each glycan structure is a directed graph modeling of a glycan structure, where nodes are monosaccharides and edges are linkages.

S3 Classes

A glycan structure vector is a vctrs record with additional S3 class glyrepr_structure. Therefore, sloop::s3_class() of a glycan structure vector is c("glyrepr_structure", "vctrs_rcrd").

Constraints for individual structures:

  • The graph must be directed and an outward tree (reducing end as root).

  • The graph must have a graph attribute anomer, in the form of "a1".

  • The graph must have a graph attribute alditol, a logical value.

  • The graph must have a vertex attribute mono for monosaccharide names.

  • The graph must have a vertex attribute sub for substituents.

  • The graph must have an edge attribute linkage for linkages.

  • Monosaccharide name must be known, either generic (Hex, HexNAc, etc.) or concrete (Glc, Gal, etc.), but not a mixture of both. NA is not allowed.

  • Substituent must be valid. For single substituents, use the form "xY", where x is position and Y is substituent name, e.g. "2Ac", "3S", etc. For multiple substituents, separate them with commas and order by position, e.g. "3Me,4Ac", "2S,6P", etc. Empty string "" means no substituents.

  • Linkages must be valid, in the form of "a/bX-Y", where X and Y are integers, e.g. "b1-4", "a2-3", etc. NA is not allowed.

Examples

library(igraph)

# A simple glycan structure: GlcNAc(b1-4)GlcNAc
graph <- make_graph(~ 1-+2)  # 1 and 2 are monosaccharides
V(graph)$mono <- c("GlcNAc", "GlcNAc")
V(graph)$sub <- ""
E(graph)$linkage <- "b1-4"
graph$anomer <- "a1"
graph$alditol <- FALSE

# Create glycan structure vector
glycan_structure(graph)
#> <glycan_structure[1]>
#> [1] GlcNAc(b1-4)GlcNAc(a1-
#> # Unique structures: 1

# Create vector with multiple structures
glycan_structure(graph, o_glycan_core_1())
#> <glycan_structure[2]>
#> [1] GlcNAc(b1-4)GlcNAc(a1-
#> [2] Gal(b1-3)GalNAc(a1-
#> # Unique structures: 2