R/fct_edge_testing.R
estimate_test_edges_time.Rd
Estimates the running time of the test_edges function, depending on its arguments. This is useful because test_edges can take a long time to complete.
estimate_test_edges_time(
  mat,
  normalized_counts,
  nGenes,
  nRegulators,
  density = 0.02,
  nTrees = 1000,
  nShuffle = 1000,
  nCores = ifelse(is.na(parallel::detectCores()), 1,
                  max(parallel::detectCores() - 1, 1)),
  verbose = TRUE
)
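The default value of nCores in the usage above leaves one core free and falls back to a single core when detection fails. A minimal sketch of that expression follows; the wrapper name `default_cores` and its `detected` argument are hypothetical (not part of DIANE) and only exist so that the behaviour can be checked for chosen inputs.

```r
# Hypothetical helper mirroring the default nCores expression from the usage.
# `detected` stands in for the value returned by parallel::detectCores().
default_cores <- function(detected = parallel::detectCores()) {
  ifelse(is.na(detected), 1, max(detected - 1, 1))
}

default_cores(NA)  # core detection failed: falls back to 1
default_cores(1)   # single-core machine: still 1
default_cores(8)   # 8 cores detected: 7 are used
```

Passing an explicit nCores value overrides this default, which may be preferable on shared machines.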
Argument | Description |
---|---|
mat | matrix containing the importance values for each target and regulator (preferably computed with GENIE3 and the OOB importance metric) |
normalized_counts | normalized expression data containing the genes present in the mat argument, as used for the first network inference step. |
nGenes | total number of genes in the network, i.e. the union of the target genes and the regulators |
nRegulators | number of regulators used for the network inference step |
density | approximate desired density, used to build a first network whose edges are the ones to be statistically tested. Default is 0.02. Biological networks are known to have densities (ratio of edges to the total number of possible edges in the graph) between 0.001 and 0.1. The numbers of genes and regulators are needed to compute the density. |
nTrees | number of trees used for random forest importance computations |
nShuffle | number of times the response variable (target gene expression) is shuffled in order to estimate the null distribution of the importance of each predictive variable (regulator). |
nCores | Number of CPU cores to use during the procedure. Default is the detected number of cores minus one. |
verbose | If set to TRUE, feedback on the progress of the calculations is given. Default: TRUE |
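To make the role of density, nGenes and nRegulators concrete, the sketch below converts a target density into a number of edges to test. It assumes that the possible edges are directed regulator-to-gene links without self-loops, giving nRegulators * (nGenes - 1) of them. Both this counting convention and the helper name `edges_from_density` are assumptions made for illustration; the computation used internally by the package may differ.

```r
# Hypothetical helper: number of edges implied by a target density,
# ASSUMING directed regulator -> gene edges and no self-loops.
edges_from_density <- function(density, nGenes, nRegulators) {
  possible_edges <- nRegulators * (nGenes - 1)
  round(density * possible_edges)
}

# 100 regulators among 1000 genes, default density of 0.02
edges_from_density(density = 0.02, nGenes = 1000, nRegulators = 100)
# 1998
```

Under this assumption, lowering the density shrinks the first network, and therefore the set of edges submitted to testing, proportionally.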
time in seconds
links: a data frame containing the links of the network before testing, as built from the user-defined prior density. Each edge is associated with its p-value and FDR-adjusted p-value.
fdr_nEdges_curve: relation between the FDR threshold and the number of edges in the final network
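The relation between the FDR threshold and the size of the final network can be illustrated with base R alone. The sketch below uses simulated p-values (not DIANE output), adjusts them with `p.adjust(method = "fdr")`, and counts the edges kept at each threshold. The object name `curve_sketch` and its column names are assumptions, and are not necessarily those returned by the package.

```r
set.seed(42)

# Simulated p-values for 500 candidate edges (illustrative data only):
# 400 drawn uniformly (no signal) and 100 concentrated near zero.
pvalues <- c(runif(400), rbeta(100, 0.5, 20))

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust)
fdr <- p.adjust(pvalues, method = "fdr")

# Number of edges retained for a range of FDR thresholds
thresholds <- seq(0.01, 0.1, by = 0.01)
curve_sketch <- data.frame(
  fdr = thresholds,
  nEdges = sapply(thresholds, function(t) sum(fdr <= t))
)

curve_sketch
```

The edge count can only grow as the threshold is relaxed, so a curve of this kind helps in choosing a threshold that balances network size against the expected proportion of false edges.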
if (FALSE) {
  data("abiotic_stresses")
  data("gene_annotations")
  data("regulators_per_organism")
  genes <- get_locus(abiotic_stresses$heat_DEGs)
  regressors <- intersect(genes, regulators_per_organism$`Arabidopsis thaliana`)
  data <- aggregate_splice_variants(abiotic_stresses$normalized_counts)
  r <- DIANE::group_regressors(data, genes, regressors)
  mat <- DIANE::network_inference(r$counts,
                                  conds = abiotic_stresses$conditions,
                                  targets = r$grouped_genes,
                                  regressors = r$grouped_regressors,
                                  importance_metric = "MSEincrease_oob",
                                  verbose = TRUE)
  res <- DIANE::estimate_test_edges_time(mat,
                                         normalized_counts = r$counts,
                                         density = 0.02,
                                         nGenes = length(r$grouped_genes),
                                         nRegulators = length(r$grouped_regressors),
                                         nTrees = 1000,
                                         verbose = TRUE)
}