Normalize raw count data — normalize • DIANE

This function corrects for differences in sequencing depth between samples. It relies on the TCC package to build a TCC-class object containing the raw counts and the condition of each sample. The function calcNormFactors is then applied, using the normalization method chosen by the user. Optionally, it first removes potentially differentially expressed genes, so that the normalization factors computed in the second normalization step are less biased. It returns a TCC object, with an element norm_factors containing the computed normalization factors.
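The steps described above can be sketched directly with TCC. This is not DIANE's source code, only an approximation under stated assumptions: the count matrix is simulated, the group labels are made up, and "edger" is used for the DEG detection step so that DESeq2 is not required. Note that TCC itself names the field holding the factors norm.factors.

```r
library(TCC)

set.seed(42)
# Simulated raw counts (assumption): 200 genes, 2 hypothetical conditions x 3 replicates
counts <- matrix(
  rnbinom(200 * 6, mu = 100, size = 10),
  ncol = 6,
  dimnames = list(paste0("gene", 1:200),
                  c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3"))
)
group <- c(1, 1, 1, 2, 2, 2)

# Build the TCC-class object from the raw counts and the sample groups
tcc <- new("TCC", counts, group)

# TMM normalization, with one round of DEG removal before the final step
tcc <- TCC::calcNormFactors(tcc, norm.method = "tmm", test.method = "edger",
                            iteration = 1, FDR = 0.01, floorPDEG = 0.05)

# One normalization factor per sample
print(tcc$norm.factors)
```

Setting iteration = FALSE in the call above would skip the DEG removal step and compute the factors in a single pass.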

normalize(
  data,
  conditions = stringr::str_split_fixed(colnames(data), "_", 2)[, 1],
  norm_method = "tmm",
  deg_method = "deseq2",
  fdr = 0.01,
  iteration = TRUE
)

Arguments

data	raw counts to be normalized (data frame or matrix), with genes as rownames and conditions as columns.
conditions	condition of each column of the data argument. By default, the part of each column name that precedes the first underscore.
norm_method	method used for normalization, either "tmm" or "deseq2"
deg_method	method used for DEG detection when iteration is TRUE, either edgeR or deseq2
fdr	threshold on adjusted p-values for DEG detection when iteration is TRUE
iteration	whether or not to perform a prior removal of DEGs (TRUE or FALSE)
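To illustrate the default value of conditions, the sketch below applies the same stringr call to a few made-up sample names. The names are hypothetical and only assume a condition_replicate naming convention.

```r
# Hypothetical sample names following a condition_replicate convention
sample_names <- c("control_1", "control_2", "heat_1", "heat_2")

# Same expression as the default of the conditions argument:
# split each name at the first underscore and keep the left part
conditions <- stringr::str_split_fixed(sample_names, "_", 2)[, 1]
print(conditions)
```

Because the split happens at the first underscore only, a condition name that itself contains an underscore would be truncated. In that case, passing conditions explicitly is the safer choice.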

Value

A TCC-class object.

Details

Filtering low counts is highly recommended after normalization. Consider using the DIANE::filter_low_counts function just after this function.

You can get the normalized expression matrix with TCC::getNormalizedData(tcc), tcc being the result of DIANE::normalize() or DIANE::filter_low_counts().
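The workflow described above might look like the following sketch. It assumes that DIANE and TCC are installed, that the second argument of DIANE::filter_low_counts is a threshold on the sum of counts across samples, and that the threshold value used here (10 reads per sample on average) is only an illustrative choice.

```r
# Demo dataset shipped with DIANE
data("abiotic_stresses", package = "DIANE")

# Step 1: normalization (no prior DEG removal here)
tcc_object <- DIANE::normalize(abiotic_stresses$raw_counts,
                               abiotic_stresses$conditions,
                               iteration = FALSE)

# Step 2: low count filtering, with an illustrative threshold (assumption)
threshold <- 10 * length(abiotic_stresses$conditions)
tcc_object <- DIANE::filter_low_counts(tcc_object, threshold)

# Step 3: extract the normalized expression matrix
normalized_counts <- TCC::getNormalizedData(tcc_object)
dim(normalized_counts)
```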

Examples

data("abiotic_stresses")
tcc_object <- DIANE::normalize(abiotic_stresses$raw_counts,
                               abiotic_stresses$conditions, iteration = FALSE)