
Create a new experiment
The data container for a glycoproteomics or glycomics experiment.
An expression matrix, sample information, and variable information are required and are then managed by the experiment object.
It acts as the data core of the glycoverse ecosystem.
The glyexp package provides a set of functions to create, manipulate, and analyze experiment() objects in a tidyverse style.
Arguments
- expr_mat
An expression matrix with samples as columns and variables as rows.
- sample_info
A tibble with a column named "sample"; the other columns hold useful information about samples, e.g. group, batch, sex, age, etc.
- var_info
A tibble with a column named "variable"; the other columns hold useful information about variables, e.g. protein name, peptide, glycan composition, etc.
- exp_type
The type of the experiment, "glycomics" or "glycoproteomics".
- glycan_type
The type of glycan, "N" or "O".
- ...
Other meta data about the experiment.
Requirements of the input data
Expression matrix:
Must be a numeric matrix with variables as rows and samples as columns.
The column names must correspond to sample IDs.
The row names must correspond to variable IDs.
Sample information (sample_info):
Must be a tibble with a column named "sample" (sample ID).
Each value in "sample" must be unique.
The set of "sample" values must match the column names of the expression matrix (order does not matter).
Variable information (var_info):
Must be a tibble with a column named "variable" (variable ID).
Each value in "variable" must be unique.
The set of "variable" values must match the row names of the expression matrix (order does not matter).
The function will automatically reorder the expression matrix to match the order of "sample" and "variable" in the info tables.
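The requirements above can be sketched in base R. This is only an illustration of the documented checks and the reordering step, not the package's actual implementation: `check_and_align()` is a hypothetical helper, and plain data frames stand in for tibbles to keep the sketch dependency-free.

```r
# Hypothetical helper (not part of glyexp): mirrors the documented requirements.
check_and_align <- function(expr_mat, sample_info, var_info) {
  stopifnot(
    is.matrix(expr_mat), is.numeric(expr_mat),
    "sample" %in% names(sample_info),
    "variable" %in% names(var_info),
    # IDs in the info tables must be unique
    anyDuplicated(sample_info$sample) == 0,
    anyDuplicated(var_info$variable) == 0,
    # ID sets must match the matrix dimnames; order does not matter
    setequal(colnames(expr_mat), sample_info$sample),
    setequal(rownames(expr_mat), var_info$variable)
  )
  # Reorder rows and columns to follow the order of the info tables
  expr_mat[var_info$variable, sample_info$sample, drop = FALSE]
}

expr_mat <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("V1", "V2"), c("S1", "S2", "S3"))
)
# Info tables deliberately list the IDs in a different order than the matrix
sample_info <- data.frame(
  sample = c("S3", "S1", "S2"), group = c("A", "B", "A"),
  stringsAsFactors = FALSE
)
var_info <- data.frame(variable = c("V2", "V1"), stringsAsFactors = FALSE)

aligned <- check_and_align(expr_mat, sample_info, var_info)
colnames(aligned)
rownames(aligned)
```

After the call, the matrix columns and rows follow the order of "sample" and "variable" in the info tables rather than their original order.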
Meta data
Other meta data can be added to the meta_data attribute.
meta_data is a list of additional information about the experiment.
Two meta data fields are required:
exp_type: "glycomics" or "glycoproteomics"
glycan_type: "N" or "O"
Other meta data will be added by other glycoverse packages for their own purposes.
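As a rough picture of what such a list might look like, the sketch below builds a plain named list and checks the two required fields against the documented values. The `instrument` field is a made-up example of extra meta data, and the checks are an assumption about what validation could look like, not the package's own code.

```r
# Illustrative meta data list; `instrument` is a hypothetical extra field.
meta_data <- list(
  exp_type = "glycoproteomics",
  glycan_type = "N",
  instrument = "hypothetical extra field"
)

# isTRUE() also rejects a missing (NULL) field, which `%in%` alone would not.
stopifnot(
  isTRUE(meta_data$exp_type %in% c("glycomics", "glycoproteomics")),
  isTRUE(meta_data$glycan_type %in% c("N", "O"))
)
```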
Index columns
The index columns are the backbone that keeps your data synchronized:
The "sample" column in sample_info must match the column names of expr_mat.
The "variable" column in var_info must match the row names of expr_mat.
These columns act as unique identifiers, ensuring that your expression matrix, sample information, and variable information always stay in sync, no matter how you filter, arrange, or subset your data.
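To see why a shared identifier keeps things in sync, here is a minimal base R sketch that filters the sample table by hand and then uses the surviving "sample" IDs to subset the matrix columns. It only illustrates the idea; it does not use or describe the package's own filtering functions, and a plain data frame stands in for a tibble.

```r
expr_mat <- matrix(
  c(1, 2, 3, 4, 5, 6), nrow = 2,
  dimnames = list(c("V1", "V2"), c("S1", "S2", "S3"))
)
sample_info <- data.frame(
  sample = c("S1", "S2", "S3"), group = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# Filter the sample table, then select matrix columns by the remaining IDs
kept <- sample_info[sample_info$group == "A", , drop = FALSE]
expr_sub <- expr_mat[, kept$sample, drop = FALSE]

# The two pieces still describe the same samples, in the same order
identical(colnames(expr_sub), kept$sample)
```

Because the lookup is by ID rather than by position, the same approach keeps working after rows of the info table are rearranged.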
Examples
expr_mat <- matrix(runif(9), nrow = 3, ncol = 3)
colnames(expr_mat) <- c("S1", "S2", "S3")
rownames(expr_mat) <- c("V1", "V2", "V3")
sample_info <- tibble::tibble(sample = c("S1", "S2", "S3"), group = c("A", "B", "A"))
var_info <- tibble::tibble(variable = c("V1", "V2", "V3"), protein = c("P1", "P2", "P3"))
experiment(
  expr_mat, sample_info, var_info,
  exp_type = "glycoproteomics",
  glycan_type = "N"
)
#>
#> ── Glycoproteomics Experiment ──────────────────────────────────────────────────
#> ℹ Expression matrix: 3 samples, 3 variables
#> ℹ Sample information fields: group <chr>
#> ℹ Variable information fields: protein <chr>