CaDrA Search — CaDrA • CaDrA

Performs permutation-based testing on a sample of permuted input scores, using candidate_search as the main iterative function for each run.

CaDrA(
  FS,
  input_score,
  method = c("ks_pval", "ks_score", "wilcox_pval", "wilcox_score", "revealer", "custom"),
  custom_function = NULL,
  custom_parameters = NULL,
  alternative = c("less", "greater", "two.sided"),
  weight = NULL,
  top_N = 1,
  search_start = NULL,
  search_method = c("both", "forward"),
  max_size = 7,
  n_perm = 1000,
  obs_best_score = NULL,
  smooth = TRUE,
  plot = TRUE,
  ncores = 1,
  cache_path = NULL,
  verbose = FALSE
)

Arguments

FS

a SummarizedExperiment object containing binary features, where rows represent features of interest (e.g. genes, transcripts, exons) and columns represent samples.

input_score

a vector of continuous scores representing a phenotypic readout of interest such as protein expression, pathway activity, etc.

NOTE: input_score object must have names or labels that match the column names of FS object.
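The naming requirement above can be checked before calling CaDrA(). The following is a minimal sketch, assuming a plain binary matrix (FS_mat, a hypothetical stand-in) in place of the SummarizedExperiment; colnames() works the same way on both. Reordering the scores to the column order is shown as a precaution, not as a documented requirement.

```r
# Hypothetical stand-in for the binary feature data in FS
# (rows = features, columns = samples).
FS_mat <- matrix(
  c(1, 0, 1,
    0, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# Named score vector, deliberately in a different order than the columns.
input_score <- c(s3 = 0.2, s1 = 1.5, s2 = -0.7)

# The scores must be named, and the names must cover the same samples.
stopifnot(
  !is.null(names(input_score)),
  setequal(names(input_score), colnames(FS_mat))
)

# Align the scores to the column order of the feature matrix.
input_score <- input_score[colnames(FS_mat)]
```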

method

a character string specifying the scoring method used in the search. There are 6 options: "ks_pval", "ks_score", "wilcox_pval", "wilcox_score", "revealer" (conditional mutual information from REVEALER), or "custom" (a customized scoring method). Default is "ks_pval".

custom_function

if method is "custom", specifies the name of the customized function. Default is NULL.

NOTE: custom_function() must take FS and input_score as its input arguments, and it must return a vector of row-wise scores, ordered from most significant to least significant, whose names match the row names of the FS object.

custom_parameters

if method is "custom", specifies a list of additional arguments (excluding FS and input_score) to be passed to custom_function. Default is NULL.
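As an illustration of the contract described for custom_function and custom_parameters, here is a minimal sketch of a scoring function. The name wilcox_rowscore and its min_n argument are hypothetical and not part of CaDrA; the function accepts either a plain matrix or a SummarizedExperiment and returns one-sided Wilcoxon p-values sorted from most to least significant.

```r
# Hypothetical custom scoring function (not part of CaDrA).
# Takes FS and input_score, plus one extra argument (min_n) of the kind
# that would be supplied through custom_parameters.
wilcox_rowscore <- function(FS, input_score, min_n = 2) {
  # Use the matrix directly, or extract the assay from a SummarizedExperiment.
  mat <- if (is.matrix(FS)) FS else SummarizedExperiment::assay(FS)

  # Align scores to the sample (column) order of the feature matrix.
  input_score <- input_score[colnames(mat)]

  scores <- apply(mat, 1, function(x) {
    # Features with too few samples in either group get the worst p-value.
    if (sum(x == 1) < min_n || sum(x == 0) < min_n) return(1)
    stats::wilcox.test(
      input_score[x == 1], input_score[x == 0],
      alternative = "greater"
    )$p.value
  })

  # Smallest p-value first; names are the row names of the feature matrix.
  sort(scores)
}

# Toy data: feature A marks the four highest-scoring samples.
toy_FS <- matrix(
  c(1, 1, 1, 1, 0, 0, 0, 0,
    0, 0, 0, 0, 1, 1, 1, 1,
    1, 0, 1, 0, 1, 0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), paste0("s", 1:8))
)
toy_score <- setNames(8:1, paste0("s", 1:8))

res <- wilcox_rowscore(toy_FS, toy_score, min_n = 2)
```

Under these assumptions, such a function would be supplied with method = "custom", and its extra argument would go in custom_parameters, e.g. list(min_n = 2).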

alternative

a character string specifying the alternative hypothesis ("two.sided", "greater", or "less"). Default is "less" for left-skewed significance testing. NOTE: this argument only applies to the KS and Wilcoxon methods.

weight

if method is "ks_score", specifies a vector of weights used to perform a weighted KS test. Default is NULL.

top_N

an integer specifying the number of top-ranked features from which to start the search. By default, the search starts from the single best feature (top_N = 1). NOTE: If top_N is provided, then the search_start parameter will be ignored.

search_start

a list of character strings (separated by commas) specifying the feature names within the FS object to start the search with. If search_start is provided, then the top_N parameter will be ignored. Default is NULL.

search_method

a character string specifying the algorithm used to filter out the best candidates ("forward" or "both"). Default is "both" (i.e., backward and forward).

max_size

an integer specifying the maximum size to which a meta-feature can extend in a given search. Default is 7.

n_perm

an integer specifying the number of permutations to perform. Default is 1000.
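To make the role of n_perm concrete, here is a minimal sketch of how an empirical null distribution of best scores can be built by permuting the input scores across samples. The best_score() helper is a hypothetical stand-in for one search run (it is not candidate_search), and the toy data are simulated.

```r
set.seed(42)

# Toy binary feature matrix (3 features x 10 samples); stand-in for FS.
FS_mat <- matrix(
  rbinom(30, 1, 0.4), nrow = 3,
  dimnames = list(paste0("f", 1:3), paste0("s", 1:10))
)
input_score <- setNames(rnorm(10), colnames(FS_mat))

# Hypothetical stand-in for one search run: the largest gap in mean score
# between samples with and without a feature (larger = better).
best_score <- function(mat, score) {
  max(apply(mat, 1, function(x) {
    if (all(x == 1) || all(x == 0)) return(0)  # uninformative feature
    mean(score[x == 1]) - mean(score[x == 0])
  }))
}

n_perm <- 100

# For each permutation, shuffle the score values across samples (keeping the
# sample names in their original order) and record the best score found.
perm_best_scores <- vapply(seq_len(n_perm), function(i) {
  permuted <- setNames(sample(unname(input_score)), names(input_score))
  best_score(FS_mat, permuted)
}, numeric(1))

# Best score on the unpermuted data, to compare against the null.
obs_best_score <- best_score(FS_mat, input_score)
```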

obs_best_score

a numeric value corresponding to the best observed score. This value is compared against the permuted best scores. Default is NULL. If NULL, the observed best score is computed based on the given parameters.

smooth

a logical value indicating whether or not to smooth the p-value calculation to avoid a p-value of 0. Default is TRUE.
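One common way to smooth a permutation p-value is to add 1 to both the numerator and the denominator. The sketch below assumes that convention and assumes larger scores are better; the function name perm_pvalue is hypothetical, and the exact formula and comparison operator used inside CaDrA are not stated here.

```r
# Hypothetical helper: empirical p-value from a vector of permuted best scores.
perm_pvalue <- function(perm_best_scores, obs_best_score, smooth = TRUE) {
  # Pseudo-count of 1 when smoothing, 0 otherwise.
  pseudo <- if (smooth) 1 else 0
  n_extreme <- sum(perm_best_scores >= obs_best_score)
  (n_extreme + pseudo) / (length(perm_best_scores) + pseudo)
}

perm <- c(1, 2, 3, 4)

# No permuted score reaches the observed score of 5.
p_raw    <- perm_pvalue(perm, 5, smooth = FALSE)  # 0 / 4 = 0
p_smooth <- perm_pvalue(perm, 5, smooth = TRUE)   # 1 / 5 = 0.2
```

With smoothing, the smallest attainable p-value is 1 / (n_perm + 1) rather than 0.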

plot

a logical value indicating whether or not to plot the empirical null distribution of the permuted best scores. Default is TRUE.

ncores

an integer specifying the number of cores to use for parallelizing the permutation-based testing. Default is 1.

cache_path

a full path used to cache the permuted best scores. Cached scores are reused instead of being recomputed, to save time. Default is NULL. If NULL, the cache path is set to ~/.Rcache for future loading.

verbose

a logical value indicating whether or not to print diagnostic messages. Default is FALSE.

Value

a list containing the key parameters used to cache the result of the permutation-based testing, a vector of permuted best scores for the given n_perm, the observed best score, and the permutation p-value.

Author

Reina Chau

Examples


# Load pre-computed feature set
data(sim_FS)

# Load pre-computed input-score
data(sim_Scores)

# Define additional parameters and start the function
cadra_result <- CaDrA(
  FS = sim_FS, input_score = sim_Scores, method = "ks_pval", 
  weight = NULL, alternative = "less", top_N = 1,
  search_start = NULL, search_method = "both", max_size = 7, 
  n_perm = 10, plot = FALSE, smooth = TRUE, obs_best_score = NULL,
  ncores = 1, cache_path = NULL
)
#> Setting cache root path as: /home/runner/.cache/R/R.cache
  |======================================================================| 100%