The function corrects for differences in sequencing depth between samples. It relies on the TCC package to build a TCC-class object containing the raw counts and the condition of each sample. The function calcNormFactors is then applied with the method chosen by the user. Optionally, it first removes potentially differentially expressed genes (DEGs), so that the normalization factors computed in the second normalization step are less biased. It returns a TCC object with an element norm_factors containing the computed normalization factors.

normalize(
  data,
  conditions = stringr::str_split_fixed(colnames(data), "_", 2)[, 1],
  norm_method = "tmm",
  deg_method = "deseq2",
  fdr = 0.01,
  iteration = TRUE
)

Arguments

data

raw counts to be normalized (data frame or matrix), with genes as rownames and conditions as columns.

conditions

condition of each column of the data argument. By default, the conditions are taken from the column names, as the prefix before the first underscore.

norm_method

method used for normalization, either "tmm" or "deseq2"

deg_method

method used for DEG detection, if performed, either "edgeR" or "deseq2"

fdr

threshold on adjusted p-values for DEG detection, if performed

iteration

whether or not to perform a prior removal of DEGs (TRUE or FALSE)
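
To illustrate how the default value of conditions is derived from the column names, here is a minimal sketch. The toy count matrix and its sample names (C_1, C_2, H_1, H_2) are hypothetical and only serve the illustration; the last expression is the one shown in the usage above.

```r
# Hypothetical toy count matrix: 3 genes, 4 samples named <condition>_<replicate>
counts <- matrix(
  c(10, 12, 50, 48,
    0, 1, 3, 2,
    200, 180, 90, 95),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("gene1", "gene2", "gene3"),
    c("C_1", "C_2", "H_1", "H_2")
  )
)

# Same expression as the default of `conditions`: split each column name
# at the first underscore and keep the part before it
conditions <- stringr::str_split_fixed(colnames(counts), "_", 2)[, 1]
conditions
# [1] "C" "C" "H" "H"
```

If the column names do not follow this naming scheme, the conditions should be passed explicitly.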

Value

a TCC-class object

Details

Filtering low counts is highly recommended after normalization; consider using the DIANE::filter_low_counts function right after this function.

You can get the normalized expression matrix with TCC::getNormalizedData(tcc), where tcc is the result of DIANE::normalize() or DIANE::filter_low_counts().

Examples

data("abiotic_stresses")
tcc_object <- DIANE::normalize(abiotic_stresses$raw_counts, abiotic_stresses$conditions, iteration = FALSE)
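
The workflow recommended in Details (normalize, filter low counts, then extract the normalized matrix) could look like the sketch below. It assumes that DIANE::filter_low_counts takes the TCC object followed by a count threshold; that second argument and the threshold value are assumptions made for illustration and should be checked against the filter_low_counts help page.

```r
library(DIANE)

data("abiotic_stresses")

# Step 1: compute normalization factors (no prior DEG removal here)
tcc_object <- DIANE::normalize(
  abiotic_stresses$raw_counts,
  abiotic_stresses$conditions,
  iteration = FALSE
)

# Step 2: remove low-count genes.
# Assumption: the second argument is a threshold on counts summed over samples;
# the value below (10 per sample) is illustrative only.
threshold <- 10 * length(abiotic_stresses$conditions)
tcc_object <- DIANE::filter_low_counts(tcc_object, threshold)

# Step 3: retrieve the normalized expression matrix
normalized_counts <- TCC::getNormalizedData(tcc_object)
dim(normalized_counts)
```

The resulting matrix keeps one column per sample; the number of rows depends on the threshold chosen in step 2.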