Performs glycan-centric Network of Cancer Genes (NCG) Over-Representation Analysis (ORA). Instead of traditional protein-centric enrichment, this function links specific glycan traits to cancer gene associations. It groups differential analysis results by glycan trait and computes cancer gene enrichment for each trait, which helps answer questions like "Which cancer gene sets are enriched in proteins with a specific dysregulated glycan motif?"

Usage

enrich_gc_ora_ncg(
  dea_res,
  dea_p_cutoff = 0.05,
  dea_log2fc_cutoff = c(-1, 1),
  universe = NULL,
  p_adj_method = "BH",
  p_cutoff = 0.05,
  q_cutoff = 0.2
)

Arguments

dea_res

Differential analysis result. Can be one of:

dea_p_cutoff

P-value cutoff for statistical significance. Defaults to 0.05. For glystats result input, adjusted p-values are used.

dea_log2fc_cutoff

Log2 fold change cutoff for statistical significance. A length-2 numeric vector giving the negative and positive boundaries, respectively. For example, c(-1, 1) means "log2FC < -1 or log2FC > 1", and c(-Inf, 1) means "log2FC > 1". Defaults to c(-1, 1).
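To make the two-sided cutoff concrete, here is a minimal base R sketch of the filtering rule described above. The data frame, its column names, and the helper is_dysregulated() are hypothetical and only illustrate the semantics; combining the p-value and fold-change cutoffs with a strict "<" and a logical AND is an assumption, not a statement about the function's internals.

```r
# Toy differential analysis table (hypothetical values and column names).
dea <- data.frame(
  variable = c("v1", "v2", "v3", "v4"),
  log2fc   = c(-1.8, 0.4, 1.3, 2.1),
  p_adj    = c(0.01, 0.03, 0.20, 0.001)
)

# Hypothetical helper: a row counts as dysregulated when its p-value is below
# the cutoff AND its log2FC lies outside the (lower, upper) interval.
is_dysregulated <- function(log2fc, p, p_cutoff = 0.05, log2fc_cutoff = c(-1, 1)) {
  p < p_cutoff & (log2fc < log2fc_cutoff[1] | log2fc > log2fc_cutoff[2])
}

# Default boundaries: both down- and up-regulated rows are kept.
both_dirs <- dea$variable[is_dysregulated(dea$log2fc, dea$p_adj)]

# c(-Inf, 1): nothing can fall below -Inf, so only up-regulated rows are kept.
up_only <- dea$variable[
  is_dysregulated(dea$log2fc, dea$p_adj, log2fc_cutoff = c(-Inf, 1))
]
```

In this toy table, v3 passes the fold-change boundary but fails the p-value cutoff, so it is excluded in both cases.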

universe

UniProt IDs of the background genes, passed directly to the universe argument of clusterProfiler::enrichGO(). If NULL (default), all genes in the data will be used. Another common pattern is to use all detected proteins as background genes. You can use detected_universe() to help you.

p_adj_method

Passed to pAdjustMethod of clusterProfiler::enrichGO().
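As an illustration of what the default "BH" adjustment does to a set of raw p-values, the sketch below uses base R's p.adjust() on made-up numbers. It assumes the method names match those accepted by p.adjust(), and it is not a description of how the enrichment itself is run.

```r
# Hypothetical raw p-values for four terms.
p_raw <- c(0.001, 0.02, 0.03, 0.5)

# Benjamini-Hochberg adjustment, the default method name used here.
p_bh <- p.adjust(p_raw, method = "BH")

# A stricter alternative for comparison.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Terms that would pass a 0.05 cutoff on the adjusted p-value.
n_pass_bh   <- sum(p_bh <= 0.05)
n_pass_bonf <- sum(p_bonf <= 0.05)
```

With these numbers, the BH adjustment keeps more terms than the Bonferroni correction at the same cutoff, which is the usual reason to prefer it for exploratory enrichment.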

p_cutoff

Passed to pvalueCutoff of clusterProfiler::enrichGO().

q_cutoff

Passed to qvalueCutoff of clusterProfiler::enrichGO().

Value

A list with two elements:

  • tidy_result: A tibble with enrichment results containing the following columns:

    • trait: Glycan trait

    • id: Term ID

    • description: Term description

    • gene_ratio: Ratio of genes in the term to total genes in the input

    • bg_ratio: Ratio of genes in the term to total genes in the background

    • rich_factor: Proportion of the term's total background genes found in the input

    • fold_enrichment: Ratio of gene_ratio to bg_ratio (magnitude of enrichment)

    • z_score: Directional trend of regulation (positive for up, negative for down)

    • p_val: Raw p-value from hypergeometric test

    • p_adj: Adjusted p-value

    • q_val: Q-value (FDR)

    • gene_id: IDs of the input genes annotated to the term (separated by "/")

    • count: Number of input genes annotated to the term

  • raw_result: The raw clusterProfiler clusterProfResult object
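The ratio columns above can be reproduced by hand from four counts. The following base R sketch uses made-up counts for a single trait and term; the variable names k, n, M and N are illustrative, and the choice of denominators (genes that carry at least one annotation) is an assumption about the underlying convention.

```r
# Hypothetical counts for one glycan trait and one term.
k <- 4     # input genes annotated to the term (the "count" column)
n <- 20    # input genes for this trait
M <- 50    # background genes annotated to the term
N <- 1000  # background genes in total

gene_ratio      <- k / n                  # share of the input found in the term
bg_ratio        <- M / N                  # share of the background in the term
rich_factor     <- k / M                  # share of the term covered by the input
fold_enrichment <- gene_ratio / bg_ratio  # how much more often than expected

# Over-representation p-value: probability of seeing k or more term genes
# when drawing n genes at random from the background (hypergeometric test).
p_val <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
```

Here the term makes up 5% of the background but 20% of the input, so fold_enrichment is 4, while rich_factor shows that the input covers 8% of the term's genes.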

What is glycan-centric enrichment?

In traditional glycoproteomics data analysis, we usually perform differential expression analysis (DEA) on glycoforms, extract proteins that have dysregulated glycosylation, then perform functional enrichment (e.g. GO) on these proteins. This is what enrich_xxx() functions do (e.g. enrich_ora_go()).

enrich_gc_xxx() functions differ in that they link specific glycan traits with functional annotations. Instead of answering the question "Which functions are enriched in dysregulated glycoproteins?", enrich_gc_xxx() answers questions like "Which functions are enriched in proteins with dysregulated core-fucosylation?" This gives higher specificity and deeper insights: by focusing on distinct glycan motifs, it helps you pinpoint the functional relevance of specific glycosylation changes.
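The difference between the two approaches comes down to how the input gene lists are built. The base R sketch below contrasts them on a toy table; the protein labels, trait names and column names are hypothetical, and the code only illustrates the grouping idea, not the package's implementation.

```r
# Toy differential analysis result: one row per protein-trait combination.
dea <- data.frame(
  protein     = c("P1", "P2", "P3", "P1", "P4", "P5"),
  trait       = c("core_fuc", "core_fuc", "core_fuc", "sia", "sia", "sia"),
  significant = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
)

sig <- dea[dea$significant, ]

# Protein-centric: one pooled list of proteins with any dysregulated trait.
protein_centric <- unique(sig$protein)

# Glycan-centric: one list per trait, each enriched separately afterwards.
glycan_centric <- split(sig$protein, sig$trait)
```

Each element of glycan_centric would then go through its own over-representation test, so the resulting terms can be attributed to a particular trait rather than to glycosylation changes in general.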

Common usage pattern

A common pattern of using this function is:

# 1. Use `glydet` to calculate derived traits or motif quantification.
trait_exp <- derive_traits(exp)  # or `quantify_motifs()`

# 2. Perform differential analysis with `glystats`.
dea_res <- gly_ttest(trait_exp)

# 3. Use this function.
go_res <- enrich_gc_ora_go(dea_res)  # or other `enrich_gc_xxx()` functions