R/fct_network_inference.R
network_inference.Rd
GENIE3 needs to be given a list of genes, which will be the nodes of the inferred network. Among those genes, some must be designated as potential regulators. GENIE3 estimates the influence of every regulator on each input gene, using their respective expression profiles. You can specify which conditions should be considered for those profiles during network inference. For each target gene, the method uses Random Forests to rank all regulators by their influence on the target's expression. These rankings are then merged across all targets, giving a global ranking of regulatory links, which is stored in the result matrix.
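The per-target ranking described above can be sketched in a few lines of R. This is a minimal illustration, not the code of `network_inference()`: the function name `genie3_sketch`, its arguments, the toy data, and the choice to scale each target profile are all assumptions made for the example. It relies only on `randomForest::randomForest()` and `randomForest::importance()`.

```r
# Minimal GENIE3-style sketch (illustrative only, not the package's implementation).
# expr: numeric matrix with genes in rows and samples in columns.
genie3_sketch <- function(expr, regressors, targets, n_trees = 100) {
  weights <- matrix(0, nrow = length(regressors), ncol = length(targets),
                    dimnames = list(regressors, targets))
  for (target in targets) {
    # A gene is never used to predict its own expression
    predictors <- setdiff(regressors, target)
    x <- t(expr[predictors, , drop = FALSE])  # samples x regulators
    # Assumption of this sketch: scale the target so weights are comparable
    y <- as.numeric(scale(expr[target, ]))
    rf <- randomForest::randomForest(x = x, y = y, ntree = n_trees)
    # type = 2 is the total decrease in node impurity ("node purity")
    imp <- randomForest::importance(rf, type = 2)
    weights[predictors, target] <- imp[predictors, 1]
  }
  weights
}

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  genes <- paste0("gene", 1:6)
  # Hypothetical toy data: 6 genes, 20 samples
  expr <- matrix(rnorm(6 * 20), nrow = 6,
                 dimnames = list(genes, paste0("cond_", 1:20)))
  # Make gene3 depend strongly on gene1
  expr["gene3", ] <- 2 * expr["gene1", ] + rnorm(20, sd = 0.1)
  w <- genie3_sketch(expr, regressors = c("gene1", "gene2"), targets = genes)
  print(round(w, 2))
}
```

In this toy setting, the weight of `gene1` on `gene3` should clearly exceed that of `gene2`, since `gene3` was built from `gene1`. Sorting all entries of the weight matrix then gives the global ranking of links.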
network_inference(
  normalized.count,
  conds,
  regressors,
  targets,
  nTrees = 1000,
  nCores = ifelse(is.na(parallel::detectCores()), 1, max(parallel::detectCores() - 1, 1)),
  verbose = TRUE,
  importance_metric = "node_purity"
)
Argument | Description |
---|---|
normalized.count | Normalized expression matrix containing the regressors and target genes in its rows, and samples as columns |
conds | Condition names to be used in the inference (the part of each column name before the underscore, not the full column names) |
regressors | Genes to be taken as regressors during the inference procedure (regulator genes) |
targets | Genes to be included in the inferred network. Regressors can also be among the targets |
nTrees | Number of trees per Random Forest |
nCores | Number of CPU cores to use during the procedure. Default is the detected number of cores minus one. |
verbose | If set to TRUE, feedback on the progress of the calculations is given. Default: TRUE |
importance_metric | Character, either node_purity or MSEincrease_oob. This is the importance type computed for the regulator-gene pairs, as returned by the randomForest package. Default is node_purity, the metric used in GENIE3. Our improved version of the method uses MSEincrease_oob, for consistency with the statistical testing of edges. The default metric is around 4 times faster, but more sensitive to the number of regulators and to over-fitting. Too few samples will lead to NA values in MSEincrease_oob; in that case, it is advised to use GENIE3's default metric. |
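To make the `conds` argument concrete, here is a small base R sketch of how condition names could be matched against column names. The column names and the `<condition>_<replicate>` pattern with a single underscore are assumptions for illustration; the package's internal matching may differ.

```r
# Hypothetical column names following a "<condition>_<replicate>" pattern
sample_names <- c("C_1", "C_2", "H_1", "H_2", "SH_1", "SH_2")
conds <- c("C", "H")

# Keep the part of each column name before the underscore
sample_conds <- sub("_.*$", "", sample_names)

# Exact matching on the condition name, so "H" does not also select "SH"
selected <- sample_names[sample_conds %in% conds]
print(selected)
```

Matching on the extracted prefix rather than with a pattern search avoids accidentally picking up conditions whose names merely contain another condition's name.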
Matrix filled with regulator-target regulatory weights
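As an illustration of how such a weight matrix can be turned into a ranked list of links, the following base R sketch uses a small made-up matrix. The orientation (regulators in rows, targets in columns), the gene names, and the weights are assumptions for the example, not output of `network_inference()`.

```r
# Hypothetical weight matrix: regulators in rows, targets in columns (assumed)
mat <- matrix(c(0,   0.8, 0.1,
                0.3, 0.2, 0.5),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("regA", "regB"), c("regA", "geneX", "geneY")))

# Flatten to one row per regulator-target pair
links <- as.data.frame(as.table(mat), stringsAsFactors = FALSE)
names(links) <- c("regulator", "target", "weight")

# Drop self-links, then sort by decreasing weight
links <- links[links$regulator != links$target, ]
links <- links[order(links$weight, decreasing = TRUE), ]
print(head(links, 3))
```

Keeping only the top rows of this table is one simple way to threshold the ranking into a network.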
if (FALSE) {
  data("abiotic_stresses")
  data("regulators_per_organism")
  aggregated_data <- aggregate_splice_variants(data = abiotic_stresses$normalized_counts)
  genes <- get_locus(abiotic_stresses$heat_DEGs)
  regressors <- intersect(genes, regulators_per_organism[["Arabidopsis thaliana"]])
  mat <- network_inference(aggregated_data,
    conds = abiotic_stresses$conditions,
    targets = genes, regressors = regressors,
    nTrees = 1000, nCores = 4
  )
}