Read Byonic-pGlycoQuant result

If you used Byonic for intact glycopeptide identification and pGlycoQuant for quantification, this is the function for you. It reads in a pGlycoQuant result file and returns a glyexp::experiment() object. Currently only label-free quantification is supported.

Usage

read_byonic_pglycoquant(
  fp,
  sample_info = NULL,
  quant_method = "label-free",
  glycan_type = "N",
  sample_name_converter = NULL,
  orgdb = "org.Hs.eg.db",
  parse_structure = TRUE
)

Arguments

fp: File path of the pGlycoQuant result file.
sample_info: File path of the sample information file (csv), or a sample information data.frame/tibble.
quant_method: Quantification method. Either "label-free" or "TMT". Currently only "label-free" is supported. Default is "label-free".
glycan_type: Glycan type. One of "N", "O-GalNAc", "O-GlcNAc", "O-Man", "O-Fuc", or "O-Glc". Default is "N".
sample_name_converter: A function to convert sample names from file paths. The function should take a character vector of old sample names and return new sample names. Note that sample names in sample_info should match the new names. If NULL, original names are kept.
orgdb: Name of the OrgDb package to use for UniProt to gene symbol conversion. Default is "org.Hs.eg.db".
parse_structure: Logical. Whether to parse glycan structures. If TRUE (default), glycan structures are parsed and included in the var_info as the glycan_structure column. If FALSE, structure parsing is skipped and structure-related columns are removed.
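
As an illustration of the sample_name_converter argument, here is a minimal sketch of a converter that reduces raw-file paths to bare sample names. The helper name clean_sample_names, the example paths, and the result path are hypothetical, and the exact form of the sample names in your own result file may differ.

```r
# Hypothetical converter: strip the directory and the extension from file paths.
clean_sample_names <- function(old_names) {
  # Normalise Windows separators so basename() behaves the same on any platform
  unix_style <- gsub("\\\\", "/", old_names)
  tools::file_path_sans_ext(basename(unix_style))
}

# Made-up sample names as they might appear before conversion
old_names <- c("D:\\data\\S1.raw", "D:\\data\\S2.raw")
new_names <- clean_sample_names(old_names)

# Placeholder path; the call runs only if the function and the file are available
fp <- "path/to/Quant.spectra.list"
if (exists("read_byonic_pglycoquant") && file.exists(fp)) {
  exp <- read_byonic_pglycoquant(fp, sample_name_converter = clean_sample_names)
}
```

If a sample information table is also supplied, its sample column should contain the converted names rather than the original paths.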

Value

A glyexp::experiment() object.

Which file to use?

You should use the "Quant.spectra.list" file in the pGlycoQuant result folder. Files from Byonic result folder are not needed. For instructions on how to use Byonic and pGlycoQuant, please refer to the manual: pGlycoQuant.

Multisite glycopeptides

Multisite glycopeptides are supported, but their protein_site is set to NA because the exact site of glycosylation cannot be determined unambiguously.

Variable information

The following columns can be found in the variable information tibble:

peptide: character, peptide sequence
peptide_site: integer, site of glycosylation on peptide
protein: character, protein accession
protein_site: integer, site of glycosylation on protein
gene: character, gene name (symbol)
glycan_composition: glyrepr::glycan_composition(), glycan compositions
glycan_structure: glyrepr::glycan_structure(), glycan structures (if parse_structure = TRUE)

Sample information

The sample information file should be a csv file whose first column is named sample; the remaining columns hold sample information. The values in the sample column must match the sample names in the pGlycoQuant result file (after sample_name_converter is applied, if one is given), although the order can be different.

You can put any useful information in the sample information file. Recommended columns are:

group: grouping or conditions, e.g. "control" or "tumor", required for most downstream analyses
batch: batch information, required for batch effect correction
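
A sample information table with the recommended columns could be prepared as in the sketch below. The sample names, groups, and batches are made up for illustration; either the data frame itself or the path of the written csv file can then be supplied as sample_info.

```r
# Toy sample information: the first column must be named "sample"
sample_info <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  group = c("control", "control", "tumor", "tumor"),
  batch = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Optionally write it to a csv file and pass the file path instead
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_info, csv_path, row.names = FALSE)

# Read it back to confirm the layout survived the round trip
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)
```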

Aggregation

pGlycoQuant performs quantification at the PSM level, which is too detailed for most downstream analyses. This function aggregates PSMs into glycopeptides by summation: for each glycopeptide (a unique combination of "peptide", "peptide_site", "protein", "protein_site", "gene", "glycan_composition", and "glycan_structure"), the quantification values of all PSMs that belong to it are summed.
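
The idea of summing PSMs per glycopeptide can be sketched on a toy table with base R. This is not the package's internal implementation; the data are invented, and only two of the identifying columns are used to keep the example short.

```r
# Toy PSM-level table: two PSMs belong to the same glycopeptide (NGTK + H5N2)
psms <- data.frame(
  peptide = c("NGTK", "NGTK", "AANVSR"),
  glycan_composition = c("H5N2", "H5N2", "H5N4"),
  S1 = c(100, 50, 10),  # intensities in sample S1
  S2 = c(200, 0, 20),   # intensities in sample S2
  stringsAsFactors = FALSE
)

# Sum the intensities of all PSMs sharing the same identifying columns
glycopeptides <- aggregate(
  cbind(S1, S2) ~ peptide + glycan_composition,
  data = psms,
  FUN = sum
)
```

After aggregation the table holds one row per glycopeptide, with each sample column containing the summed intensity of its PSMs.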