Mixture Model clustering — run_coseq

Performs expression-based clustering of genes. Uses the coseq package to fit Poisson or Gaussian mixture models to gene expression profiles, estimating the multivariate distribution parameters of each cluster with an EM algorithm. Several numbers of clusters can be tested and compared in terms of likelihood, to help the user choose among them.

run_coseq(
  conds,
  genes,
  data,
  K = 6:12,
  transfo = "none",
  model = "Poisson",
  seed = NULL
)

Arguments

conds	Condition names to be used for clustering. Must be a vector of unique names containing the conditions you want to consider for gene clustering, without the replicate information (that is, the string before the underscore in sample names).
genes	Genes used as an input for the clustering. They must be present in the row names of data.
data	Normalized counts, with genes as row names and samples as columns.
K	Range of numbers of clusters to test.
transfo	Transformation to apply to the normalized counts before fitting "Normal" mixture models. Must be one of "arcsin", "logit", "logMedianRef", "profile", "logclr", "clr", "alr", "ilr", or "none". With the "Poisson" model, no transformation is applied and this argument is ignored.
model	Model to use for the mixture: either "Poisson" or "Normal".
seed	Seed for the random number generator, to ensure reproducible runs.
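
The naming convention assumed by conds (the condition is the part of each sample name before the underscore) can be illustrated with a small base R sketch. The sample names and the count matrix below are made up for illustration and are not part of the package.

```r
# Hypothetical sample names following the "condition_replicate" convention
samples <- c("C_1", "C_2", "H_1", "H_2", "SH_1", "SH_2")

# Toy normalized count matrix: genes as row names, samples as columns
set.seed(1)
counts <- matrix(rpois(4 * length(samples), lambda = 50),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), samples))

# Condition of each sample: everything before the first underscore
conditions <- sub("_.*$", "", colnames(counts))

# Unique condition names, in the form expected for the conds argument
conds <- unique(conditions)
conds

# Genes passed to the clustering must all be row names of the data
genes <- c("gene1", "gene3")
stopifnot(all(genes %in% rownames(counts)))
```

With real data, the same two lines (sub followed by unique) applied to the column names of the normalized counts would give a candidate value for conds, provided the sample names follow this convention.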

Value

Named list containing the coseq run result as "model", and the cluster membership for each gene as "membership".
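
As a rough sketch of how the returned list might be explored, the code below assumes that "membership" is a vector of cluster labels named by gene. That structure is an assumption, not something stated above, and the object used here is a hand-made stand-in rather than actual run_coseq output.

```r
# Hand-made stand-in for the returned list (hypothetical values);
# the real "model" element holds the coseq run result, left as NULL here
membership <- c(geneA = 1, geneB = 2, geneC = 1, geneD = 3)
clustering <- list(model = NULL, membership = membership)

# Number of genes assigned to each cluster
cluster_sizes <- table(clustering$membership)
cluster_sizes

# Names of the genes assigned to cluster 1
cluster1_genes <- names(clustering$membership)[clustering$membership == 1]
cluster1_genes
```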

Examples

data("abiotic_stresses")
genes <- abiotic_stresses$heat_DEGs
clustering <- run_coseq(conds = unique(abiotic_stresses$conditions),
  data = abiotic_stresses$normalized_counts, genes = genes, K = 6:9)
#> ****************************************
#> coseq analysis: Poisson approach & none transformation
#> K = 6 to 9 
#> Use seed argument in coseq for reproducible results.
#> ****************************************
#> Running g = 6 ...
#> Running g = 7 ...
#> Running g = 8 ...
#> Running g = 9 ...