
Add Motif Annotations to an Experiment
These functions add motif annotations to the variable information of a glyexp::experiment().
add_motifs_int() adds integer annotations (how many times each motif occurs).
add_motifs_lgl() adds boolean annotations (whether each motif is present).
Usage
add_motifs_int(exp, motifs, alignments = NULL, ignore_linkages = FALSE)
add_motifs_lgl(exp, motifs, alignments = NULL, ignore_linkages = FALSE)
Arguments
- exp
A glyexp::experiment() object.
- motifs
A character vector of motif names, IUPAC-condensed structure strings, or a 'glyrepr_structure' object.
- alignments
A character vector specifying alignment types for each motif. Can be a single value (applied to all motifs) or a vector of the same length as motifs.
- ignore_linkages
A logical value. If TRUE, linkages will be ignored in the comparison.
Value
A glyexp::experiment() object with motif annotations added to the variable information.
About Names
The naming rule for the new columns is similar to that of have_motifs().
Briefly, you can pass a named character vector as motifs, and the names will be used as the new column names.
The only catch is that you cannot pass a named glyrepr::glycan_structure() to motifs.
This is a fundamental limitation of the vctrs_rcrd class, which glyrepr::glycan_structure() is built on.
Why do we need these functions
Adding one motif annotation to a glyexp::experiment() is easy:
exp |>
  mutate_var(has_hex = have_motif(glycan_structure, "Hex"))
However, adding multiple motifs is not as straightforward.
You can still use mutate_var() to add multiple motifs like this:
exp |>
  mutate_var(
    n_hex = count_motif(glycan_structure, "Hex"),
    n_dhex = count_motif(glycan_structure, "dHex"),
    n_hexnac = count_motif(glycan_structure, "HexNAc"),
  )
This method has two problems:
- It requires a lot of boilerplate code (a lot of typing).
- It is not very efficient, as each call to count_motif() performs validation and conversion on glycan_structure, which is a time-consuming process.
Advanced R users might want to use count_motifs() (the plural cousin of count_motif()) with !!!:
exp |>
  mutate_var(!!!count_motifs(glycan_structure, c("Hex", "dHex", "HexNAc")))
Sadly, this doesn't work.
Firstly, count_motifs() returns a matrix, not a list.
Secondly, even if you use as.data.frame() to convert it to a list, !!! triggers early evaluation of glycan_structure in the calling environment, before it is passed to count_motifs().
Because glycan_structure is a column of the variable information rather than an object in the calling environment, this raises an "object not found" error, and there is no easy way to fix this, at least for now.
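The early-evaluation problem can be reproduced without any glycan data. The following is a minimal sketch that uses dplyr::mutate() on a plain tibble as an analogue of mutate_var(); the column glycan_col and the helper count_chars() are hypothetical stand-ins introduced only for illustration and are not part of glyexp or glymotif.

```r
library(dplyr)

# Hypothetical data: `glycan_col` exists only as a column of `df`,
# not as an object in the calling environment.
df <- tibble(glycan_col = c("a", "bb", "ccc"))

# Hypothetical stand-in for a function that returns several named columns.
count_chars <- function(x) list(n_char = nchar(x))

# Works: `glycan_col` is looked up inside the data mask.
ok <- df |> mutate(n_char = count_chars(glycan_col)$n_char)

# Fails: `!!!` forces `count_chars(glycan_col)` to be evaluated in the
# calling environment, before mutate() sets up the data mask.
res <- tryCatch(
  df |> mutate(!!!count_chars(glycan_col)),
  error = function(e) e
)

# `res` is an error condition complaining that `glycan_col` was not found.
print(conditionMessage(res))
```

The same mechanism applies whenever the operand of !!! refers to a column that exists only inside the data mask.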
Therefore, we think it would be better to have a function that adds multiple motif annotations in a single call, in a more intuitive way. That's why we provide these two functions.
Under the hood, they use a more straightforward approach:
- Get the motif annotation matrix using count_motifs() or have_motifs().
- Convert the matrix to a tibble.
- Use dplyr::bind_cols() to add the tibble to the variable information.
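The steps above can be sketched with plain tibbles. This is an illustration only, not the package's actual implementation: var_info stands in for the variable information of an experiment, and motif_mat is a hand-written matrix standing in for the result of count_motifs(), assuming one row per glycan and one named column per motif.

```r
library(dplyr)
library(tibble)

# Hypothetical variable information with three glycans.
var_info <- tibble(variable = c("V1", "V2", "V3"))

# Hand-written stand-in for the motif annotation matrix
# (assumed layout: rows are glycans, columns are motifs).
motif_mat <- matrix(
  c(3L, 5L, 4L,   # counts for "Hex"
    1L, 0L, 2L),  # counts for "dHex"
  nrow = 3,
  dimnames = list(NULL, c("Hex", "dHex"))
)

# Convert the matrix to a tibble; column names become variable names.
motif_tbl <- as_tibble(motif_mat)

# Append the motif columns to the variable information.
new_var_info <- bind_cols(var_info, motif_tbl)
print(new_var_info)
```

Because the matrix is computed once for all motifs, the structures only need to be validated and converted a single time, which is the efficiency gain described above.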