Poisson regression models gene expression (Y) as a function of gene mean and sample covariates (X):

mu = beta * X
Y ~ Poisson(mu)
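For reference, the textbook Poisson deviance of a single gene, with observed counts y_i over data points i and fitted means mu_i, is written out below. This is the standard definition; that compute_gene_deviance() evaluates exactly this expression internally is an assumption and is not stated on this page.

```latex
D = 2 \sum_{i} \left[ y_i \log\frac{y_i}{\mu_i} - (y_i - \mu_i) \right],
\qquad \text{with } y_i \log\frac{y_i}{\mu_i} := 0 \text{ when } y_i = 0 .
```

A gene whose counts are fully described by its fitted mean and Poisson noise has a deviance close to its residual degrees of freedom; much larger values indicate variability left unexplained.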

compute_gene_deviance(data, family = "poisson", covar = NULL,
  precision = c("double", "single"), verbose = FALSE,
  optimiser = greta::adam(), max_iterations = 5000,
  tolerance = 1e-06)

poisson_regression(data, covar = NULL, beta_mean = 0, beta_sd = 3,
  precision = c("double", "single"))

Arguments

data

data matrix (data points * dimensions); can be a dense or sparse matrix, a SummarizedExperiment/SingleCellExperiment, or a Seurat object (the counts slot is used). poisson_regression() currently accepts only a dense matrix (a limitation of greta).

family

character naming the data distribution

covar

matrix (data points * covariates) or vector of column names (for compute_gene_deviance() with SingleCellExperiment or Seurat input) containing covariates that affect expression in addition to the gene mean (e.g. coverage, batch). Adding covariates finds genes whose deviance (residuals) is explained neither by these covariates nor by Poisson noise (covar = NULL tests Poisson noise alone).

precision

precision argument passed to the greta model. Use "single" for large datasets to reduce the memory footprint.

verbose

logical, plot greta model structure and print messages?

optimiser

method used to find regression coefficients and deviance when covariates are added; see greta::opt().

max_iterations

number of iterations to run optimiser for.

tolerance

the numerical tolerance for the solution, the optimiser stops when the (absolute) difference in the joint density between successive iterations drops below this level.

beta_mean

prior mean for coefficients

beta_sd

prior sd for coefficients; use small values to regularise (i.e. penalise coefficients that deviate too much from 0)
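The regularising effect of beta_sd can be illustrated without greta. The sketch below finds the maximum a posteriori coefficients of a Poisson regression under a normal prior using base-R optim(). It assumes a log link and simulated data; the helper names (neg_log_posterior, neg_log_posterior_grad, fit_map) are hypothetical and are not part of the package.

```r
# Sketch only: MAP Poisson regression under a Normal(beta_mean, beta_sd) prior,
# fitted with optim() instead of greta. A log link is assumed.
set.seed(2)
n <- 200
x <- rnorm(n)                         # simulated covariate
y <- rpois(n, exp(1 + 0.8 * x))       # simulated counts for one gene
X <- cbind(intercept = 1, x = x)      # design matrix: gene mean + covariate

# Negative log posterior = Poisson negative log likelihood + normal prior penalty
neg_log_posterior <- function(beta, beta_mean, beta_sd) {
  mu <- exp(drop(X %*% beta))
  -sum(dpois(y, mu, log = TRUE)) -
    sum(dnorm(beta, beta_mean, beta_sd, log = TRUE))
}

# Analytic gradient of the function above
neg_log_posterior_grad <- function(beta, beta_mean, beta_sd) {
  mu <- exp(drop(X %*% beta))
  -drop(crossprod(X, y - mu)) + (beta - beta_mean) / beta_sd^2
}

fit_map <- function(beta_sd) {
  optim(c(0, 0), neg_log_posterior, neg_log_posterior_grad,
        beta_mean = 0, beta_sd = beta_sd, method = "BFGS")$par
}

beta_wide   <- fit_map(beta_sd = 3)     # weak prior, close to the unpenalised fit
beta_narrow <- fit_map(beta_sd = 0.01)  # tight prior, coefficients pulled towards 0
rbind(beta_wide, beta_narrow)
```

With the tight prior both coefficients are shrunk towards beta_mean = 0, which is the penalisation described for small beta_sd values.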

Value

compute_gene_deviance(): list containing the deviance vector with dimension names (genes) as names, the beta coefficient matrix (dimensions * coefficients) and the greta model used to compute them. For SingleCellExperiment, the same object is returned with beta coefficients and deviance added to rowData. For Seurat, the same object is returned with beta coefficients and deviance added to Seurat::GetAssay(obj, "RNA")@meta.features.

poisson_regression(): R environment containing the model and parameters as greta arrays
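To make the role of covar concrete, the following sketch compares the Poisson deviance of one simulated gene with and without a covariate. It uses base-R glm() (log link) as a stand-in for the greta-based fit, so it illustrates the idea rather than the package's implementation. The covariate name coverage and the helper poisson_deviance are hypothetical.

```r
# Sketch only: glm() stands in for the greta-based fit used by the package.
set.seed(1)
n_cells  <- 500
coverage <- rnorm(n_cells)                    # hypothetical sample covariate
y <- rpois(n_cells, exp(1 + 0.5 * coverage))  # counts for one simulated gene

# Textbook Poisson deviance, with the y * log(y / mu) term set to 0 when y == 0
poisson_deviance <- function(y, mu) {
  2 * sum(ifelse(y == 0, 0, y * log(y / mu)) - (y - mu))
}

fit_mean  <- glm(y ~ 1, family = poisson())         # gene mean only (like covar = NULL)
fit_covar <- glm(y ~ coverage, family = poisson())  # gene mean + covariate

dev_mean  <- poisson_deviance(y, fitted(fit_mean))
dev_covar <- poisson_deviance(y, fitted(fit_covar))
c(mean_only = dev_mean, with_covariate = dev_covar)
```

Here the covariate drives part of the variability, so the deviance drops once it is included. A gene whose deviance stays high after adding covariates has variability that neither the covariates nor Poisson noise accounts for.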

Examples

# Use fake data as example
# Random data that fits into the triangle
set.seed(4355)
arc_data = generate_arc(arc_coord = list(c(7, 3, 10), c(12, 17, 11), c(30, 20, 9)),
                        mean = 0, sd = 1)
data = generate_data(arc_data$XC, N_examples = 1e4, jiiter = 0.04, size = 0.9)
# Take a Poisson sample with the mean defined by each entry of the data matrix
# (this creates Poisson-distributed non-negative integer data)
data = matrix(rpois(length(data), data), nrow(data), ncol(data))
# Compute deviance from the mean (residuals for Poisson data)
dev = compute_gene_deviance(t(data))
# As you can see, the third dimension has the lowest deviance
dev
#> $deviance
#> [1] 19179.31 16610.19 10733.26
#>
#> $beta
#>      beta_mean
#> [1,]  2.639314
#> [2,]  2.546119
#> [3,]  2.158645
#>
#> $model
#> NULL
#>
# because the vertices of the triangle have almost identical position in third dimension.
plot_arc(arc_data = arc_data, data = data,
         which_dimensions = 1:3, data_alpha = 0.5)
#> No trace type specified:
#>   Based on info supplied, a 'scatter3d' trace seems appropriate.
#>   Read more about this trace type -> https://plot.ly/r/reference/#scatter3d
#> A marker object has been specified, but markers is not in the mode
#> Adding markers to the mode...
# You can use deviance to find which dimensions have variability to be explained with Archetypal Analysis

# Create a probabilistic Poisson regression model with greta
# to study effects of covariates on Poisson data (requires greta installed)
# NOT RUN {
model = poisson_regression(t(data),
                           covar = matrix(rnorm(ncol(data)), ncol(data), 1))
# plot the structure of tensorflow computation graph
plot(model$model)
# find parameters using adam optimiser
res = greta::opt(model$model,
                 optimiser = greta::adam(), max_iterations = 500)
# did the model converge before 500 iterations?
res$iterations
# Value of Poisson negative log likelihood (see greta documentation for details)
res$value
# View beta parameters for each dimension (columns), log(mean) in the first row,
# covariate coefficients in the subsequent rows
res$par$beta
# }