During network inference, the importance of each regressor is assessed for each target gene. However, regression methods are very sensitive to correlation between predictor variables, which must be handled for the results to be stable. To account for this, this function summarizes the expression of regulators that are correlated above a given threshold into new variables, each being the mean of several correlated profiles.
To do so, a graph of all regressors whose Spearman correlation exceeds the threshold is built, and groups are formed with a community detection algorithm. Each group is then averaged into a single variable. A regressor that is strongly negatively correlated with a group is added to that group, but its expression is not taken into account in the summary mean profile.
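The grouping step described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy matrix `expr` and its values are hypothetical, and single-linkage clustering cut at `1 - corr_thr` is used here as a simple stand-in for the community detection step (it yields the connected components of the thresholded correlation graph). Negatively correlated regressors are not handled in this sketch.

```r
# Toy expression matrix (hypothetical values): rows are regressors, columns are samples.
expr <- rbind(
  TF1 = c(1, 2, 3, 4, 5, 6),
  TF2 = c(1.1, 2.3, 2.9, 4.2, 5.1, 6.4),  # same ranking as TF1
  TF3 = c(3, 1, 6, 2, 5, 4)               # weakly correlated with TF1 and TF2
)
corr_thr <- 0.9

# Spearman correlation between regressors (cor() works on columns, hence t()).
corr <- cor(t(expr), method = "spearman")

# Single-linkage clustering cut at 1 - corr_thr gives the connected components
# of the graph linking regressors whose correlation is >= corr_thr.
# Assumption: this stands in for the community detection algorithm.
membership <- cutree(hclust(as.dist(1 - corr), method = "single"),
                     h = 1 - corr_thr)

# Keep only the groups that contain more than one regressor.
groups <- Filter(function(g) length(g) > 1,
                 split(names(membership), membership))
print(groups)
```

With this toy input, TF1 and TF2 end up in the same group while TF3 stays on its own. On real data, a dedicated community detection method may split a connected component into several groups, so the two approaches are not guaranteed to agree.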
group_regressors(normalized.count, genes, regressors, corr_thr = 0.9)
Argument | Description |
---|---|
normalized.count | normalized expression matrix containing genes and regressors. |
genes | target genes to be used in the inference process. |
regressors | regressor genes to be used in the inference process. |
corr_thr | correlation threshold above which regressors are grouped. |
A named list with the following elements:
counts : the normalized expression data containing the new summarized variables, in the format mean_geneID1-geneID2... The individual genes that were grouped are removed.
correlated_regressors_graph : visNetwork data of the correlated regulators
grouped_genes : new vector of target genes, with individual correlated genes replaced by groups
grouped_regressors : new vector of regressors, with individual correlated genes replaced by groups
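As a rough illustration of how a group could be turned into a summarized variable following the `mean_geneID1-geneID2` naming pattern, here is a minimal base R sketch. The matrix `counts`, the vector `group`, and all values are hypothetical, and the code is only an assumption about the general approach, not the function's actual source.

```r
# Toy normalized counts (hypothetical values): rows are genes, columns are samples.
counts <- rbind(
  TF1 = c(10, 20, 30),
  TF2 = c(12, 22, 32),
  TF3 = c(5, 1, 9),
  G1  = c(7, 7, 8)
)
regressors <- c("TF1", "TF2", "TF3")
group <- c("TF1", "TF2")  # regressors assumed to have been grouped together

# Name the summary variable following the mean_geneID1-geneID2 pattern.
group_name <- paste0("mean_", paste(group, collapse = "-"))

# Average the grouped profiles, drop the individual rows, append the summary row.
summary_row <- colMeans(counts[group, , drop = FALSE])
new_counts <- rbind(counts[!rownames(counts) %in% group, , drop = FALSE],
                    summary_row)
rownames(new_counts)[nrow(new_counts)] <- group_name

# Replace the grouped regressors by the group name in the regressor vector.
new_regressors <- c(setdiff(regressors, group), group_name)
```

In this sketch the individual rows `TF1` and `TF2` disappear from `new_counts` and a single row named `mean_TF1-TF2` takes their place, mirroring the description of `counts` and `grouped_regressors` above.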
data(abiotic_stresses)
aggregated_data <- aggregate_splice_variants(abiotic_stresses$normalized_counts)
genes <- get_locus(abiotic_stresses$heat_DEGs)
regressors <- intersect(genes, regulators_per_organism[["Arabidopsis thaliana"]])
grouping <- DIANE::group_regressors(aggregated_data, genes, regressors)
#> [1] "adding tf AT1G74890 to group 2 because correlation of -0.912173913043478 to mean"
#> [1] 19
#> [1] "adding tf AT5G59570 to group 4 because correlation of -0.919130434782609 to mean"
#> [1] 18
#> [1] "counts"                      "correlated_regressors_graph"
#> [3] "grouped_genes"               "grouped_regressors"