
These functions find all occurrences of the given motif(s) in the glycans and return a node-to-node mapping for each match. They are useful only if you need the concrete node mapping, which most users do not. See have_motif() and count_motif() for more information about the matching rules.

  • match_motif() matches a single motif against multiple glycans

  • match_motifs() matches multiple motifs against multiple glycans

Unlike have_motif() and count_motif(), these functions return detailed match information. Specifically, each match of a motif in a glycan is returned as an integer vector giving the node mapping from the motif to the glycan. For example, the vector c(2, 3, 6) means that the first node in the motif matches the 2nd node in the glycan, the second node in the motif matches the 3rd node in the glycan, and the third node in the motif matches the 6th node in the glycan.
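As a minimal sketch of how to read such a vector, the base R snippet below looks up which glycan node each motif node was mapped to. The `glycan_nodes` labels and the mapping `c(2, 3, 6)` are made-up illustrative values, not output from match_motif().

```r
# Hypothetical monosaccharide labels for a six-node glycan (illustrative only)
glycan_nodes <- c("Gal", "GlcNAc", "Man", "Man", "Man", "GlcNAc")

# A mapping as described above: motif node i matches glycan node mapping[i]
mapping <- c(2, 3, 6)

# Index the glycan labels with the mapping to see what each motif node matched
matched <- glycan_nodes[mapping]
names(matched) <- paste0("motif_node_", seq_along(mapping))
print(matched)
```

The position within the vector identifies the motif node, and the value identifies the glycan node.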

Node indices are only meaningful for glyrepr::glycan_structure(), so only glyrepr::glycan_structure() is supported for glycans and motifs.

Usage

match_motif(glycans, motif, alignment = NULL, ignore_linkages = FALSE)

match_motifs(glycans, motifs, alignments = NULL, ignore_linkages = FALSE)

Arguments

glycans

A glyrepr_structure object.

motif

A glyrepr_structure object with length 1.

alignment

A character string. Possible values are "substructure", "core", "terminal" and "whole". If not provided, the value will be decided based on the motif argument. If motif is a motif name, the alignment in the database will be used. Otherwise, "substructure" will be used.

ignore_linkages

A logical value. If TRUE, linkages will be ignored in the comparison.

motifs

A glyrepr_structure object.

alignments

A character vector specifying alignment types for each motif. Can be a single value (applied to all motifs) or a vector of the same length as motifs.
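The length rule for alignments can be sketched in base R as follows. `expand_alignments()` is a hypothetical helper written only for illustration; it is not part of the package, it does not handle the NULL default, and the package's own validation may differ.

```r
# Hypothetical helper (not a package function): expand `alignments` to one
# value per motif, following the length rule described above.
expand_alignments <- function(alignments, n_motifs) {
  valid <- c("substructure", "core", "terminal", "whole")
  stopifnot(all(alignments %in% valid))
  if (length(alignments) == 1L) {
    # A single value is applied to every motif
    return(rep(alignments, n_motifs))
  }
  if (length(alignments) != n_motifs) {
    stop("`alignments` must have length 1 or the same length as `motifs`.")
  }
  alignments
}

expand_alignments("core", 3)
expand_alignments(c("core", "whole"), 2)
```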

Value

A nested list of integer vectors.

  • match_motif(): Two levels of nesting. The outer list corresponds to glycans, and the inner list corresponds to matches. Use purrr::pluck(result, glycan_index, match_index) to access the match information. For example, purrr::pluck(result, 1, 2) means the 2nd match in the 1st glycan.

  • match_motifs(): Three levels of nesting. The outermost list corresponds to motifs, the middle list corresponds to glycans, and the innermost list corresponds to matches. Use purrr::pluck(result, motif_index, glycan_index, match_index) to access the match information. For example, purrr::pluck(result, 1, 2, 3) means the 3rd match in the 2nd glycan for the 1st motif.
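To make the nesting concrete, the sketch below builds a list by hand in the three-level shape described for match_motifs() and indexes into it with base R. All values are made up, and representing "no matches" as an empty list is an assumption of this sketch rather than documented behavior.

```r
# Hand-built list mimicking the documented shape of match_motifs() output:
# motifs (outer) -> glycans (middle) -> matches (inner). Values are made up.
result <- list(
  list(                                    # motif 1
    list(c(1L, 2L, 3L)),                   #   glycan 1: one match
    list(c(1L, 2L), c(4L, 5L), c(6L, 7L))  #   glycan 2: three matches
  ),
  list(                                    # motif 2
    list(),                                #   glycan 1: no matches (assumed form)
    list(c(2L, 3L))                        #   glycan 2: one match
  )
)

# 3rd match in the 2nd glycan for the 1st motif
result[[1]][[2]][[3]]

# Number of matches per glycan, for each motif
lapply(result, lengths)
```

For elements that exist, chained `[[` indexing returns the same value as purrr::pluck() with the same indices. The difference is that purrr::pluck() returns NULL for a missing element, whereas `[[` raises an error.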

Vertex and Linkage Indices

The indices of vertices and linkages in a glycan correspond directly to their order in the IUPAC-condensed string, which is printed when you print a glyrepr::glycan_structure(). For example, for the glycan Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-), the vertices are "Man", "Man", "Man", "GlcNAc", "GlcNAc", and the linkages are "a1-3", "a1-6", "b1-4", "b1-4".

Thus, matching the motif "Man(a1-3)Man(b1-4)" to this glycan yields c(1, 3). This indicates that the first motif vertex (the a1-3 Man) corresponds to the first vertex in the glycan, and the second motif vertex (the b1-4 Man) corresponds to the third vertex in the glycan.
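The ordering can be checked by hand with a rough base R sketch that pulls vertex and linkage labels out of the string with regular expressions. This is not how glyrepr parses structures; the patterns below are assumptions that hold for this simple string and may fail on more complex notation.

```r
iupac <- "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-"

# Vertices: a run of letters/digits immediately followed by "("
vertices <- regmatches(iupac, gregexpr("[A-Za-z0-9]+(?=\\()", iupac, perl = TRUE))[[1]]

# Linkages: text enclosed in "(...)"; the trailing "(b1-" is the reducing-end
# anomer rather than a linkage, and it is skipped because it has no ")"
linkages <- regmatches(iupac, gregexpr("(?<=\\()[^)]+(?=\\))", iupac, perl = TRUE))[[1]]

vertices
linkages

# The mapping c(1, 3) from the text picks the a1-3 Man and the b1-4 Man
vertices[c(1, 3)]
```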

Examples

library(glymotif)
library(glyparse)
library(glyrepr)

(glycan <- n_glycan_core())
#> <glycan_structure[1]>
#> [1] Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-
#> # Unique structures: 1

# Let's peek under the hood of the nodes in the glycan
glycan_graph <- get_structure_graphs(glycan)
igraph::V(glycan_graph)$mono  # node indices 1 to 5, in this order
#> [1] "Man"    "Man"    "Man"    "GlcNAc" "GlcNAc"

# Match a single motif against a single glycan
motif <- parse_iupac_condensed("Man(a1-3)[Man(a1-6)]Man(b1-")
match_motif(glycan, motif)
#> [[1]]
#> [[1]][[1]]
#> [1] 1 2 3
#> 
#> 

# Match multiple motifs against a single glycan
motifs <- c(
  "Man(a1-3)[Man(a1-6)]Man(b1-",
  "Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-"
)
motifs <- parse_iupac_condensed(motifs)
match_motifs(glycan, motifs)
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]][[1]]
#> [1] 1 2 3
#> 
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [[2]][[1]][[1]]
#> [1] 1 3 4 5
#> 
#> 
#>