find_set_activity_AUCell() finds the activity of each gene set in each cell by combining AUCell functions into a pipeline. The Aerts lab, which developed AUCell, recommends adjusting the threshold when binarising activities (details: https://www.bioconductor.org/packages/devel/bioc/vignettes/AUCell/inst/doc/AUCell.html#determine-the-cells-with-the-given-gene-signatures-or-active-gene-sets). See the documentation of AUCell_buildRankings, AUCell_calcAUC and AUCell_exploreThresholds for details.

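find_set_activity_pseudoinv() finds the activity of each gene set in each cell by solving the matrix equation expression = gene_assignment_to_sets x activities, which gives activities = pseudoinverse(gene_assignment_to_sets) x expression. Here expression has genes in rows and cells in columns, and gene_assignment_to_sets has genes in rows and gene sets in columns. Based on code from the Inferelator package by the Richard Bonneau lab.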
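
As a minimal sketch of the pseudoinverse idea (not the package's or Inferelator's actual code), the toy example below builds a small binary gene-to-set assignment matrix, simulates expression from known activities, and recovers them. All names (assignment, true_activity, pseudoinverse) and values are made up for illustration. It assumes genes in rows and cells in columns, and computes the Moore-Penrose pseudoinverse with base R's svd().

```r
# Toy gene-to-set assignment: 4 genes (rows) x 2 gene sets (columns).
assignment <- matrix(c(1, 1, 0, 0,
                       0, 0, 1, 1),
                     nrow = 4,
                     dimnames = list(paste0("gene", 1:4), c("setA", "setB")))

# Known (made-up) activities: 2 gene sets (rows) x 3 cells (columns).
true_activity <- matrix(c(2, 0,
                          0, 3,
                          1, 1),
                        nrow = 2,
                        dimnames = list(c("setA", "setB"), paste0("cell", 1:3)))

# Simulated expression: genes (rows) x cells (columns).
expr <- assignment %*% true_activity

# Moore-Penrose pseudoinverse via SVD: V %*% diag(1 / d) %*% t(U),
# dropping singular values that are numerically zero.
pseudoinverse <- function(m, tol = sqrt(.Machine$double.eps)) {
  s <- svd(m)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Recover activities: pseudoinverse(assignment) x expression.
activities <- pseudoinverse(assignment) %*% expr
rownames(activities) <- colnames(assignment)
```

Because each toy gene belongs to exactly one set, assignment has full column rank and the known activities are recovered exactly. With noisy expression the same product gives a least-squares estimate instead.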

find_set_activity_AUCell(expr_mat, assay_name = "logcounts",
  aucMaxRank = nrow(expr_mat) * 0.05, gene_sets, gene_col = "ALIAS",
  set_id_col = "GOALL", set_name_col = "TERM", binary = FALSE,
  nCores = 1, plotHist = FALSE, plotStats = TRUE, ...)

find_set_activity_pseudoinv(expr_mat, assay_name = "logcounts",
  gene_sets, gene_col = "ALIAS", set_id_col = "GOALL",
  set_name_col = "TERM", noself = FALSE)

Arguments

expr_mat

expression matrix (genes in rows, cells in columns), or one of: dgCMatrix, ExpressionSet, SummarizedExperiment or SingleCellExperiment. The last two require assay_name.

assay_name

name of the assay in a SummarizedExperiment or SingleCellExperiment, normally "counts" or "logcounts".

aucMaxRank

argument for AUCell_calcAUC. Threshold used to calculate the AUC. Put simply, the AUC value represents the fraction of genes, within the top X genes in the ranking, that are included in the signature. The parameter 'aucMaxRank' sets the number of genes (maximum ranking) used in this computation. By default, it is set to 0.05 of the total number of genes in the rankings. Common values range from 0.01 to 0.3 of the total.

gene_sets

data.table or coercible to data.table that contains gene set annotations.

gene_col

column in gene_sets storing gene identifiers.

set_id_col

column in gene_sets storing set identifiers.

set_name_col

column in gene_sets storing readable set names.

binary

binarise gene set activities using AUCell_exploreThresholds?

nCores

number of cores for parallel processing. See AUCell docs for details.

plotHist

plot the AUC histograms? Argument for AUCell_exploreThresholds.

...

other arguments passed to AUCell_exploreThresholds.

noself

remove self-interactions from set annotations? Relevant when sets are TF targets.
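
To illustrate how aucMaxRank sets the size of the top-of-ranking window, here is a simplified sketch for a single cell. It only counts signature genes that fall inside the window. It is not AUCell's actual AUC calculation, and the expression values, gene names and the helper hits_in_top() are hypothetical.

```r
# Toy data: 200 genes in one cell, with gene1 the most highly expressed.
n_genes <- 200
expr_one_cell <- setNames(n_genes:1, paste0("gene", seq_len(n_genes)))
signature <- c("gene2", "gene5", "gene9", "gene50", "gene120")

# Hypothetical helper: count signature genes among the top `frac` of the ranking.
hits_in_top <- function(expr, signature, frac) {
  max_rank <- ceiling(frac * length(expr))        # window size in genes
  ranked <- names(sort(expr, decreasing = TRUE))  # highest expression first
  sum(signature %in% ranked[seq_len(max_rank)])
}

hits_default <- hits_in_top(expr_one_cell, signature, 0.05)  # window: top 10 genes
hits_wide    <- hits_in_top(expr_one_cell, signature, 0.3)   # window: top 60 genes
```

In this toy ranking the narrow window captures three of the five signature genes and the wide window captures four, which shows why the choice of aucMaxRank can change which gene sets appear active.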

Value

find_set_activity_AUCell() returns a data.table of gene set activities with cells in rows and gene sets in columns. The column titled "cells" contains cell ids (column names of expr_mat).
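
The sketch below shows one way the inputs might be laid out. It assumes gene_sets is in long format (one row per gene-to-set membership) and that the identifiers in gene_col match the row names of expr_mat; neither assumption is confirmed by this page. The gene memberships and expression values are illustrative only, and the call itself is left commented out because it needs this package and AUCell to be installed.

```r
# Toy annotation (assumed long format): one row per gene-to-set membership.
# Column names match the defaults gene_col = "ALIAS", set_id_col = "GOALL",
# set_name_col = "TERM"; a data.frame is coercible to data.table.
gene_sets <- data.frame(
  ALIAS = c("CASP3", "BAX", "CDK1", "CCNB1"),
  GOALL = c("GO:0006915", "GO:0006915", "GO:0007049", "GO:0007049"),
  TERM  = c("apoptotic process", "apoptotic process", "cell cycle", "cell cycle"),
  stringsAsFactors = FALSE
)

# Toy expression matrix: genes in rows, cells in columns (made-up values).
expr_mat <- matrix(
  c(5, 0, 1,
    4, 1, 0,
    0, 6, 2,
    1, 5, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(gene_sets$ALIAS, paste0("cell", 1:3))
)

# Assumption: identifiers in gene_col should match the row names of expr_mat.
genes_matched <- all(gene_sets$ALIAS %in% rownames(expr_mat))

# Call shape taken from the Usage section; a real matrix would have
# thousands of genes rather than four.
# activities <- find_set_activity_AUCell(expr_mat, gene_sets = gene_sets,
#                                        gene_col = "ALIAS", set_id_col = "GOALL",
#                                        set_name_col = "TERM")
```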