pGlyco3 is software for intact glycopeptide identification and quantification. This function reads a pGlyco3 result file and returns a glyexp::experiment() object. Currently only label-free quantification is supported.

Usage

read_pglyco3(
  fp,
  sample_info = NULL,
  quant_method = c("label-free", "TMT"),
  glycan_type = c("N", "O"),
  sample_name_converter = NULL
)

Arguments

fp

File path of the pGlyco3 result file.

sample_info

File path of the sample information file (csv), or a sample information data.frame/tibble.

quant_method

Quantification method. Either "label-free" or "TMT".

glycan_type

Glycan type. Either "N" or "O". Default is "N".

sample_name_converter

A function to convert sample names from file paths. The function should take a character vector of old sample names and return new sample names. Note that sample names in sample_info should match the new names. If NULL, original names are kept.
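
A converter of this kind could be sketched as below. This is a minimal illustration in base R, not part of the package: the helper name strip_raw_ext, the example paths, and the ".raw" suffix are all assumptions about how the sample names might look.

```r
# Hypothetical converter: drop any directory part and a trailing ".raw"
# extension. The ".raw" suffix is an assumption about the input names.
strip_raw_ext <- function(old_names) {
  sub("\\.raw$", "", basename(old_names), ignore.case = TRUE)
}

# Made-up sample names for illustration only.
old_names <- c("D:/data/S1.raw", "D:/data/S2.raw", "S3")
new_names <- strip_raw_ext(old_names)
print(new_names)

# The function would then be passed by name, for example:
# read_pglyco3("result.txt", sample_name_converter = strip_raw_ext)
# ("result.txt" is a placeholder path.)
```

With such a converter, the sample column in sample_info would need to hold the converted names ("S1", "S2", ...), not the original ones.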

Value

A glyexp::experiment() object.

Which file to use?

You should use the result file from pGlyco3 that contains quantification information. The file should have columns including RawName, MonoArea, Peptide, Proteins, Genes, GlycanComposition, PlausibleStruct, GlySite, and ProSites.
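
Before calling the function, it may help to confirm that a candidate file has the columns listed above. The sketch below uses base R only; the helper name missing_pglyco3_cols is hypothetical, and the column list is taken from this page, so it may not be exhaustive.

```r
# Columns named on this page as expected in the result file.
required_cols <- c("RawName", "MonoArea", "Peptide", "Proteins", "Genes",
                   "GlycanComposition", "PlausibleStruct", "GlySite", "ProSites")

# Hypothetical helper: return the required columns absent from a header.
missing_pglyco3_cols <- function(col_names) {
  setdiff(required_cols, col_names)
}

# In practice the header could be read from disk, for example with
# names(utils::read.delim(fp, nrows = 1, check.names = FALSE)),
# assuming the file is tab-delimited (an assumption, not stated here).
# A made-up header that lacks the quantification column:
header <- c("RawName", "Peptide", "Proteins", "Genes", "GlycanComposition",
            "PlausibleStruct", "GlySite", "ProSites")
missing_cols <- missing_pglyco3_cols(header)
print(missing_cols)
```

A non-empty result suggests the file is not the one that carries quantification information.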

Sample information

The sample information file should be a csv file with the first column named sample, and the rest of the columns being sample information. The sample column must match the RawName column in the pGlyco3 result file, although the order can be different.

You can put any useful information in the sample information file. Recommended columns are:

  • group: grouping or conditions, e.g. "control" or "tumor", required for most downstream analyses

  • batch: batch information, required for batch effect correction
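
A sample information table following the layout above might be built as in this sketch. The sample names, groups, and batches are invented, and raw_names stands in for the unique RawName values of a real result file.

```r
# Made-up sample information; the first column must be named "sample".
sample_info <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  group  = c("control", "control", "tumor", "tumor"),
  batch  = c("1", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# Stand-in for the unique RawName values of the result file (assumed here).
raw_names <- c("S3", "S1", "S4", "S2")

# Order may differ, but both sets of names must be identical.
names_match <- setequal(sample_info$sample, raw_names)
print(names_match)

# Write a csv file whose path could be passed as `sample_info`.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_info, csv_path, row.names = FALSE)
```

Either the data frame itself or the csv path can be supplied, since sample_info accepts both forms.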

Protein inference

By default, this function automatically performs protein inference using the parsimony method to resolve shared glycopeptides. This converts the plural columns (proteins, genes, protein_sites) to singular equivalents (protein, gene, protein_site).

Aggregation

pGlyco3 performs quantification at the PSM level, which is too detailed for most downstream analyses. This function aggregates PSMs into glycopeptides by summation. For each glycopeptide (a unique combination of "peptide", "peptide_site", "protein", "protein_site", "gene", "glycan_composition", and "glycan_structure"), the quantification values of all PSMs that belong to it are summed.
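
The idea of aggregation by summation can be illustrated with a toy table in base R. This is only a sketch of the concept, not the package's implementation: the data are invented, the intensity column name is hypothetical, and the grouping key is shortened (it omits protein_site, gene, and glycan_structure) to keep the example small.

```r
# Toy PSM-level table; every value here is made up for illustration.
psms <- data.frame(
  peptide            = c("NGTK", "NGTK", "NGTK", "AANK"),
  peptide_site       = c(1L, 1L, 1L, 3L),
  protein            = c("P1", "P1", "P1", "P2"),
  glycan_composition = c("H5N2", "H5N2", "H5N4", "H5N2"),
  sample             = c("S1", "S1", "S1", "S1"),
  intensity          = c(100, 50, 30, 20)
)

# Sum the intensities of all PSMs sharing the same glycopeptide key,
# separately for each sample.
glycopeptides <- aggregate(
  intensity ~ peptide + peptide_site + protein + glycan_composition + sample,
  data = psms,
  FUN = sum
)
print(glycopeptides)
```

The two PSMs of NGTK carrying H5N2 collapse into a single row, while the H5N4 glycoform of the same peptide stays separate because the glycan composition is part of the key.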

Output

This function returns a glyexp::experiment() object.

The variable information tibble contains the following columns:

  • peptide: character, peptide sequence

  • peptide_site: integer, site of glycosylation on peptide

  • protein: character, protein accession (after protein inference)

  • protein_site: integer, site of glycosylation on protein (after protein inference)

  • gene: character, gene name (symbol) (after protein inference)

  • glycan_composition: glyrepr::glycan_composition(), glycan composition

  • glycan_structure: glyrepr::glycan_structure(), glycan structure