
Match Motif(s) in Glycans
These functions find all occurrences of the given motif(s) in the glycans.
Node-to-node mapping is returned for each match.
These functions are NOT useful for most users unless you are interested in the concrete node mapping.
See have_motif() and count_motif() for more information about the matching rules.
- match_motif() matches a single motif against multiple glycans
- match_motifs() matches multiple motifs against multiple glycans
Unlike have_motif() and count_motif(), these functions return detailed match information.
More specifically, for each match in a glycan-motif pair, an integer vector is returned, indicating the node mapping from the motif to the glycan.
For example, if the vector is c(2, 3, 6), it means that the first node in the motif matches the 2nd node in the glycan, the second node in the motif matches the 3rd node in the glycan, and the third node in the motif matches the 6th node in the glycan.
Node indices are only meaningful for glyrepr::glycan_structure(), so only glyrepr::glycan_structure() is supported for glycans and motifs.
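To make the node mapping concrete, here is a minimal base-R sketch. The node labels and the mapping are copied from the Examples section of this page; the names glycan_mono, mapping and matched are chosen for illustration only, and the code does not call the package itself.

```r
# Monosaccharide labels of the glycan nodes, in node-index order
# (as printed by igraph::V(glycan_graph)$mono in the Examples).
glycan_mono <- c("GlcNAc", "GlcNAc", "Man", "Man", "Man")

# One match for a glycan-motif pair:
# position i holds the index of the glycan node matched by motif node i.
mapping <- c(3, 5, 4)

# Look up the glycan nodes covered by this match.
matched <- glycan_mono[mapping]
print(matched)
#> [1] "Man" "Man" "Man"

# Lay the mapping out explicitly, one row per motif node.
mapping_table <- data.frame(
  motif_node = seq_along(mapping),
  glycan_node = mapping,
  mono = matched
)
print(mapping_table)
```

Indexing the glycan's node attributes with the returned vector is usually all that is needed to translate a match back into residues.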
Usage
match_motif(glycans, motif, alignment = NULL, ignore_linkages = FALSE)
match_motifs(glycans, motifs, alignments = NULL, ignore_linkages = FALSE)
Arguments
- glycans
A glyrepr_structure object.
- motif
A glyrepr_structure object with length 1.
- alignment
A character string. Possible values are "substructure", "core", "terminal" and "whole". If not provided, the value will be decided based on the motif argument. If motif is a motif name, the alignment in the database will be used. Otherwise, "substructure" will be used.
- ignore_linkages
A logical value. If TRUE, linkages will be ignored in the comparison.
- motifs
A glyrepr_structure object.
- alignments
A character vector specifying alignment types for each motif. Can be a single value (applied to all motifs) or a vector of the same length as motifs.
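The length rule for alignments can be sketched in base R as follows. The helper recycle_alignments() is hypothetical and written only for illustration; it is not part of the package and may differ from what the package does internally. Handling of the default NULL is left out.

```r
# Hypothetical helper (NOT the package's implementation) that mimics the
# documented rule: a single value is applied to all motifs, otherwise the
# vector must have the same length as `motifs`.
recycle_alignments <- function(alignments, n_motifs) {
  valid <- c("substructure", "core", "terminal", "whole")
  if (!all(alignments %in% valid)) {
    stop("`alignments` contains an unknown alignment type.")
  }
  if (length(alignments) == 1L) {
    return(rep(alignments, n_motifs))
  }
  if (length(alignments) != n_motifs) {
    stop("`alignments` must have length 1 or the same length as `motifs`.")
  }
  alignments
}

# A single value is repeated once per motif.
recycle_alignments("core", 3)

# A full-length vector is used as given.
recycle_alignments(c("core", "whole"), 2)
```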
Value
A nested list of integer vectors.
- match_motif(): Two levels of nesting. The outer list corresponds to glycans, and the inner list corresponds to matches. Use purrr::pluck(result, glycan_index, match_index) to access the match information. For example, purrr::pluck(result, 1, 2) means the 2nd match in the 1st glycan.
- match_motifs(): Three levels of nesting. The outermost list corresponds to motifs, the middle list corresponds to glycans, and the innermost list corresponds to matches. Use purrr::pluck(result, motif_index, glycan_index, match_index) to access the match information. For example, purrr::pluck(result, 1, 2, 3) means the 3rd match in the 2nd glycan for the 1st motif.
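The nesting described above can be explored with plain base R. The lists below are hand-written stand-ins that only mimic the documented shape; they are not real output of match_motif() or match_motifs(), and the names result, results and counts are illustrative.

```r
# Stand-in for a match_motif() result (made up, NOT real output):
# two glycans; the first has two matches, the second has none.
result <- list(
  list(c(3L, 5L, 4L), c(3L, 4L, 5L)),
  list()
)

# Base-R equivalent of purrr::pluck(result, 1, 2):
# the 2nd match in the 1st glycan.
second_match <- result[[1]][[2]]

# Number of matches per glycan.
n_matches <- lengths(result)

# Stand-in for a match_motifs() result: motif > glycan > match.
results <- list(
  motif_1 = result,
  motif_2 = list(list(c(1L, 2L, 3L, 4L)), list())
)

# The 1st match in the 1st glycan for the 2nd motif.
results[[2]][[1]][[1]]

# Matches per glycan for every motif (rows: glycans, columns: motifs).
counts <- vapply(results, lengths, integer(2))
print(counts)
```

A glycan without any match appears as an empty inner list in this sketch, which is why lengths() reports 0 for it.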
Examples
library(glymotif) # provides match_motif() and match_motifs()
library(glyparse)
library(glyrepr)
(glycan <- n_glycan_core())
#> <glycan_structure[1]>
#> [1] Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc(b1-
#> # Unique structures: 1
# Let's peek under the hood of the nodes in the glycan
glycan_graph <- get_structure_graphs(glycan)
igraph::V(glycan_graph)$mono # nodes are indexed 1 to 5, in this order
#> [1] "GlcNAc" "GlcNAc" "Man" "Man" "Man"
# Match a single motif against a single glycan
motif <- parse_iupac_condensed("Man(a1-3)[Man(a1-6)]Man(b1-")
match_motif(glycan, motif)
#> [[1]]
#> [[1]][[1]]
#> [1] 3 5 4
#>
#>
# Match multiple motifs against a single glycan
motifs <- c(
"Man(a1-3)[Man(a1-6)]Man(b1-",
"Man(a1-3)Man(b1-4)GlcNAc(b1-4)GlcNAc(?1-"
)
motifs <- parse_iupac_condensed(motifs)
match_motifs(glycan, motifs)
#> [[1]]
#> [[1]][[1]]
#> [[1]][[1]][[1]]
#> [1] 3 5 4
#>
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [[2]][[1]][[1]]
#> [1] 1 2 3 4
#>
#>
#>