Perform permutation-based testing on a sample of permuted input scores,
using candidate_search as the main iterative function for each run.
CaDrA(
  FS,
  input_score,
  method = c("ks_pval", "ks_score", "wilcox_pval", "wilcox_score", "revealer", "custom"),
  custom_function = NULL,
  custom_parameters = NULL,
  alternative = c("less", "greater", "two.sided"),
  weight = NULL,
  top_N = 1,
  search_start = NULL,
  search_method = c("both", "forward"),
  max_size = 7,
  n_perm = 1000,
  obs_best_score = NULL,
  smooth = TRUE,
  plot = TRUE,
  ncores = 1,
  cache_path = NULL,
  verbose = FALSE
)
a SummarizedExperiment object containing binary features, where rows represent features of interest (e.g., genes, transcripts, exons) and columns represent samples.
a vector of continuous scores representing a phenotypic readout of interest, such as protein expression or pathway activity.
NOTE: the input_score object must have names or labels that match the column
names of the FS object.
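The name-matching requirement above can be checked before calling the function. The following is a minimal sketch in base R, assuming a plain binary matrix (`fs_mat`, hypothetical toy data) as a stand-in for the feature set; with a real SummarizedExperiment, `colnames(FS)` would be expected to play the same role as `colnames(fs_mat)`.

```r
# Toy stand-in for the feature set: 3 binary features x 4 samples (hypothetical data)
fs_mat <- matrix(
  c(1, 0, 0, 1,
    0, 1, 0, 0,
    1, 1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("feat_A", "feat_B", "feat_C"),
                  c("s1", "s2", "s3", "s4"))
)

# Named score vector, deliberately in a different sample order
scores <- c(s3 = 0.2, s1 = 1.7, s4 = -0.4, s2 = 0.9)

# Every sample in the feature set needs a matching, named score
stopifnot(!is.null(names(scores)),
          all(colnames(fs_mat) %in% names(scores)))

# Reorder the scores so they line up with the columns of the feature matrix
scores_aligned <- scores[colnames(fs_mat)]
```

Aligning by name rather than by position guards against silently pairing a score with the wrong sample.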
a character string specifying the scoring method used in the search.
There are 6 options: "ks_pval", "ks_score", "wilcox_pval", "wilcox_score",
"revealer" (conditional mutual information from REVEALER), or
"custom" (a customized scoring method).
Default is "ks_pval".
if method is "custom", specifies the name of the customized function.
Default is NULL.
NOTE: custom_function() must take FS and input_score as its input arguments,
and it must return a vector of row-wise scores ordered from most significant
to least significant, with labels or names that match the row names of the
FS object.
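To illustrate the contract described above, here is a hedged sketch of what a custom scoring function could look like. The function `mean_diff_score` is hypothetical and not part of CaDrA, the feature set is assumed to be a plain binary matrix rather than a SummarizedExperiment, and "larger difference in means" is an arbitrary choice of what counts as more significant.

```r
# Hypothetical custom scorer, shown only to illustrate the expected inputs and output.
# FS is assumed here to be a binary matrix (features x samples) with row and column names.
mean_diff_score <- function(FS, input_score) {
  scores <- apply(FS, 1, function(feature) {
    # Difference in mean score between samples with and without the feature
    mean(input_score[feature == 1]) - mean(input_score[feature == 0])
  })
  # Label each score with its feature name
  names(scores) <- rownames(FS)
  # Order from most to least significant (larger difference treated as more significant)
  sort(scores, decreasing = TRUE)
}

# Toy data (hypothetical)
fs_mat <- matrix(
  c(1, 1, 0, 0,
    0, 0, 1, 1,
    1, 0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("feat_A", "feat_B", "feat_C"),
                  c("s1", "s2", "s3", "s4"))
)
scores <- c(s1 = 2.0, s2 = 1.5, s3 = 0.1, s4 = -0.3)

ranked <- mean_diff_score(fs_mat, scores)
```

The returned vector is named by feature and sorted, which are the two properties the note above asks for.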
if method is "custom", specifies a list of additional arguments
(excluding FS and input_score) to be passed to custom_function.
Default is NULL.
a character string specifying the alternative hypothesis to test
("two.sided", "greater", or "less").
Default is "less" for left-skewed significance testing.
NOTE: this argument applies only to the KS and Wilcoxon methods.
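As a reminder of what the three alternatives mean, the sketch below uses base R's `wilcox.test` on hypothetical toy scores split by a binary feature. It shows the general semantics of `alternative` only; which group CaDrA treats as the first sample internally is not stated here, so the grouping is an assumption.

```r
# Hypothetical scores, split by whether a sample carries the feature
with_feature    <- c(1, 2, 3)
without_feature <- c(4, 5, 6, 7)

# "less": is the first group shifted towards lower values than the second?
p_less <- wilcox.test(with_feature, without_feature,
                      alternative = "less")$p.value

# "greater": is the first group shifted towards higher values?
p_greater <- wilcox.test(with_feature, without_feature,
                         alternative = "greater")$p.value

# "two.sided": is there a shift in either direction?
p_two <- wilcox.test(with_feature, without_feature,
                     alternative = "two.sided")$p.value
```

Because every value in the first group is below every value in the second, the "less" test gives the smallest p-value and the "greater" test the largest.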
if method is "ks_score", specifies a vector of weights
used to perform weighted KS testing. Default is NULL.
an integer specifying the number of features to start the
search over. By default, the search starts from the single best feature (top_N = 1).
NOTE: If top_N is provided, then the search_start parameter
will be ignored.
a list of character strings (separated by commas)
specifying the feature names within the FS object to start
the search with. If search_start is provided, then the top_N
parameter will be ignored. Default is NULL.
a character string specifying the algorithm used to filter out
the best candidates ("forward" or "both"). Default is
"both" (i.e., backward and forward).
an integer specifying the maximum size that a meta-feature can
grow to in a given search. Default is 7.
an integer specifying the number of permutations to perform.
Default is 1000.
a numeric value corresponding to the best observed
score. This value is compared against the permuted best scores.
Default is NULL. If NULL, the observed best score is computed
from the given parameters.
a logical value indicating whether to smooth the p-value
calculation to avoid a p-value of 0. Default is TRUE.
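The effect of smoothing can be sketched with a common add-one correction. This is an illustration under stated assumptions, not necessarily the exact formula the package uses: the helper `perm_pval` is hypothetical, the permuted scores are simulated, and larger scores are assumed to be better.

```r
set.seed(1)

# Hypothetical null distribution of permuted best scores
perm_best_scores <- rnorm(1000)

# Hypothetical observed best score, more extreme than every permuted score
obs_best_score <- 10

# Empirical permutation p-value; with smooth = TRUE, one is added to both the
# numerator and the denominator so the result can never be exactly 0
perm_pval <- function(obs, perm, smooth = TRUE) {
  k <- if (smooth) 1 else 0
  (sum(perm >= obs) + k) / (length(perm) + k)
}

p_raw    <- perm_pval(obs_best_score, perm_best_scores, smooth = FALSE)
p_smooth <- perm_pval(obs_best_score, perm_best_scores, smooth = TRUE)
```

Without smoothing, an observed score that beats every permutation yields a p-value of 0; with the add-one correction the smallest attainable value is 1 / (n_perm + 1).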
a logical value indicating whether to plot the empirical
null distribution of the permuted best scores. Default is TRUE.
an integer specifying the number of cores to use when
parallelizing the permutation-based testing. Default is 1.
the full path used to cache the permuted best scores.
Cached scores are reused instead of being recomputed, which saves time.
Default is NULL. If NULL, the cache path is set to ~/.Rcache
for future loading.
a logical value indicating whether to print
diagnostic messages. Default is FALSE.
a list containing the key parameters used to cache the result of the
permutation-based testing, a vector of permuted best scores for the given
n_perm, the observed best score, and the permutation p-value.
# Load pre-computed feature set
data(sim_FS)
# Load pre-computed input-score
data(sim_Scores)
# Define additional parameters and start the function
cadra_result <- CaDrA(
  FS = sim_FS, input_score = sim_Scores, method = "ks_pval",
  weight = NULL, alternative = "less", top_N = 1,
  search_start = NULL, search_method = "both", max_size = 7,
  n_perm = 10, plot = FALSE, smooth = TRUE, obs_best_score = NULL,
  ncores = 1, cache_path = NULL
)
#> Setting cache root path as: /home/runner/.cache/R/R.cache
#> |======================================================================| 100%