• The goal of the select package is to detect evolutionary dependencies between alterations/genes in cancer.
  • The select package provides a function to generate the background model, along with other utility functions.


  • You can install the development version of select from GitHub with:
# install.packages("devtools")
# devtools::install_github("CSOgroup/select")


  • We will run the select algorithm on the processed LUAD dataset from TCGA that is provided with the package.
Data Description & Format

  • The loaded data is a list object consisting of:
    • gam: a presence/absence matrix of alterations (samples in rows, alterations in columns)
    • samples: a named vector of sample annotations
    • alt: a named vector of alteration annotations
## Load the data provided with the package
data(luad_data, package = "select")
# Check the data structure
str(luad_data)
#> List of 3
#>  $ gam    : num [1:502, 1:659] 0 0 0 0 0 0 0 0 1 0 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:502] "TCGA-05-4244-01" "TCGA-05-4249-01" "TCGA-05-4250-01" "TCGA-05-4382-01" ...
#>   .. ..$ : chr [1:659] "MUT.ABL1" "MUT.ACVR1B" "MUT.ACVR2A" "MUT.AKT1" ...
#>  $ alt    : Named chr [1:659] "MUT" "MUT" "MUT" "MUT" ...
#>   ..- attr(*, "names")= chr [1:659] "MUT.ABL1" "MUT.ACVR1B" "MUT.ACVR2A" "MUT.AKT1" ...
#>  $ samples: Named chr [1:502] "LUAD" "LUAD" "LUAD" "LUAD" ...
#>   ..- attr(*, "names")= chr [1:502] "TCGA-05-4244-01" "TCGA-05-4249-01" "TCGA-05-4250-01" "TCGA-05-4382-01" ...
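
Before running the analysis it can help to confirm that the three components line up. The sketch below builds a small toy list with the same layout as luad_data (the object name toy_data and all values are invented for illustration) and checks that the annotation vectors are named after the rows and columns of the matrix. The same checks should apply to luad_data itself, assuming it follows the structure printed above.

```r
# Toy object mimicking the layout of luad_data (all values are invented).
toy_data <- list(
  gam = matrix(
    c(1, 0, 0,
      0, 1, 0,
      1, 1, 0,
      0, 0, 1),
    nrow = 4, byrow = TRUE,
    dimnames = list(
      c("S1", "S2", "S3", "S4"),     # samples in rows
      c("MUT.A", "MUT.B", "AMP.C")   # alterations in columns
    )
  ),
  alt = c(MUT.A = "MUT", MUT.B = "MUT", AMP.C = "AMP"),
  samples = c(S1 = "LUAD", S2 = "LUAD", S3 = "LUAD", S4 = "LUAD")
)

# Annotation names should match the matrix dimnames exactly,
# and the matrix should contain only 0/1 entries.
stopifnot(
  identical(names(toy_data$samples), rownames(toy_data$gam)),
  identical(names(toy_data$alt), colnames(toy_data$gam)),
  all(toy_data$gam %in% c(0, 1))
)

# Number of altered samples and alteration frequency per column.
support <- colSums(toy_data$gam)
freq <- colMeans(toy_data$gam)
print(data.frame(type = toy_data$alt, support = support, freq = freq))
```

Replacing toy_data with luad_data should run the same checks on the packaged dataset.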

Running select

  • We use the function select(), which generates the background model and the results.
  • The main parameters of the function are:
    • M: the list object of GAMs & TMB
    • sample.class: a named vector of samples with their covariates
    • alteration.class: a named vector of alterations with their covariates
    • further options, shown in the call below
  • The function returns a data frame.
alpi <- select::select(
  M = luad_data$gam,
  sample.class = luad_data$samples,
  alteration.class = luad_data$alt,
  folder = './',
  r.seed = 110,
  n.cores = 1,
  vetos = NULL,
  n.permut = 100,
  rho = 0.1,
  lambda = 15,
  save.intermediate.files = FALSE,
  randomization.switch.threshold = 30,
  calculate_APC_threshold = TRUE,
  calculate_FDR = TRUE,
  verbose = TRUE
)
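
To give some intuition for what a permutation-based background means, here is a minimal, generic sketch for a single pair of alterations. It is not the randomization used by select(), which this README does not describe; the data, the naive shuffle, and every variable name are assumptions made for illustration only.

```r
# Naive permutation background for one pair of alterations.
# NOTE: illustration only; this is NOT the randomization used by select().
set.seed(110)

n_samples <- 200
a <- rbinom(n_samples, 1, 0.20)   # toy alteration A (invented data)
b <- rbinom(n_samples, 1, 0.15)   # toy alteration B (invented data)

# Observed number of samples carrying both alterations.
observed_overlap <- sum(a == 1 & b == 1)

# Shuffle B across samples to break any dependency with A,
# then record the overlap seen in each permuted dataset.
n_permut <- 100
null_overlap <- replicate(n_permut, sum(a == 1 & sample(b) == 1))

# Background (expected) overlap under the shuffled model.
expected_overlap <- mean(null_overlap)

# Positive difference hints at co-occurrence, negative at exclusivity.
diff_overlap <- observed_overlap - expected_overlap

# Empirical two-sided p-value with the usual +1 correction.
p_value <- (sum(abs(null_overlap - expected_overlap) >=
                abs(observed_overlap - expected_overlap)) + 1) / (n_permut + 1)

cat("observed:", observed_overlap,
    "expected:", round(expected_overlap, 2),
    "diff:", round(diff_overlap, 2),
    "p:", round(p_value, 3), "\n")
```

A plain shuffle like this ignores per-sample mutation burden, so it should be read only as a conceptual aid for columns such as r_overlap and diff_overlap.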
Interpreting the results

  • Let's look at the results.
  • The table below explains what each column means.
| Column | Meaning |
|---|---|
| SFE_1 | Selected Functional Event 1 |
| SFE_2 | Selected Functional Event 2 |
| name | Interaction motif (SFE_1 - SFE_2) |
| type_1 | Alteration type of SFE_1 |
| type_2 | Alteration type of SFE_2 |
| int_type | Interaction motif type (type_1 - type_2) |
| support_1 | Number of samples altered for SFE_1 |
| support_2 | Number of samples altered for SFE_2 |
| freq_1 | Frequency of SFE_1 |
| freq_2 | Frequency of SFE_2 |
| overlap | Co-mutation between SFE_1 and SFE_2 |
| max_overlap | Maximum possible co-mutation |
| freq_overlap | Frequency of co-mutation |
| r_overlap | Background co-mutation |
| r_freq_overlap | Background frequency of co-mutation |
| diff_overlap | Difference between observed and background co-mutation |
| abs_diff_overlap | Absolute difference of co-mutations |
| direction | Interaction type (CO or ME) |
| wMI_stat | Weighted mutual information |
| wMI_p.value | P-value on mutual information |
| ME_p.value | P-value on co-mutation |
| E.r.wMI_stat | Background weighted mutual information |
| MI_diff | Difference between observed and background mutual information |
| wMI_p.value_FDR | FDR filter (TRUE/FALSE) |
| select_score_good_cancer_cell_2017_criterion_1 | Cancer Cell paper criterion |
| select_score | Effect size (select score) |
# Look into the data frame: significant pairs, ranked by select score
library(dplyr)
alpi %>% filter(wMI_p.value_FDR) %>% arrange(desc(select_score)) %>% head(2)
#>                                                                                                                  SFE_1
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493    AMP.consensus.chr14:35870717-36159897
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 DEL.consensus.chr4:183089197-186421724
#>                                                                                                                  SFE_2
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493    AMP.consensus.chr14:37858832-38371493
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 DEL.consensus.chr4:187186290-187647876
#>                                                                                                                                                            name
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493     AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876
#>                                                                                 type_1 type_2
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493      AMP    AMP
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876    DEL    DEL
#>                                                                                  int_type support_1
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493   AMP - AMP        57
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 DEL - DEL        12
#>                                                                                 support_2
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493          53
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876        12
#>                                                                                     freq_1
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493   0.11656442
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 0.02453988
#>                                                                                     freq_2 overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493   0.10838446      47
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 0.02453988      11
#>                                                                                 max_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493            53
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876          12
#>                                                                                 freq_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493      0.8867925
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876    0.9166667
#>                                                                                 r_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493       10.04
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876      0.72
#>                                                                                 r_freq_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493         0.189434
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876       0.060000
#>                                                                                 diff_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493          36.96
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876        10.28
#>                                                                                 abs_diff_overlap
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493              36.96
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876            10.28
#>                                                                                 direction  wMI_stat
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493          CO 0.4241418
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876        CO 0.3612067
#>                                                                                 wMI_p.value
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493             0
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876           0
#>                                                                                 ME_p.value
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493            0
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876          0
#>                                                                                 E.r.wMI_stat
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493    0.003527209
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876  0.016350097
#>                                                                                   MI_diff
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493   0.4206146
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876 0.3448566
#>                                                                                 wMI_p.value_FDR
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493              TRUE
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876            TRUE
#>                                                                                 select_score_good_cancer_cell_2017_criterion_1
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493                                             TRUE
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876                                           TRUE
#>                                                                                 select_score
#> AMP.consensus.chr14:35870717-36159897 - AMP.consensus.chr14:37858832-38371493      0.4005788
#> DEL.consensus.chr4:183089197-186421724 - DEL.consensus.chr4:187186290-187647876    0.3413985
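
Several columns appear to be derived from others. The relationships below are inferred from the two rows printed above rather than taken from the package documentation, so treat them as a reading aid and not as a specification. The check reproduces the first row (the AMP - AMP pair).

```r
# Values copied from the first row of the output above.
support_1 <- 57
support_2 <- 53
overlap <- 47
r_overlap <- 10.04
wMI_stat <- 0.4241418
E_r_wMI_stat <- 0.003527209

# Relationships inferred from the printed rows (assumptions, not documented API).
max_overlap <- min(support_1, support_2)    # printed max_overlap: 53
freq_overlap <- overlap / max_overlap       # printed freq_overlap: 0.8867925
r_freq_overlap <- r_overlap / max_overlap   # printed r_freq_overlap: 0.189434
diff_overlap <- overlap - r_overlap         # printed diff_overlap: 36.96
MI_diff <- wMI_stat - E_r_wMI_stat          # printed MI_diff: 0.4206146

# Compare against the printed values, allowing for rounding in the display.
stopifnot(
  max_overlap == 53,
  isTRUE(all.equal(freq_overlap, 0.8867925, tolerance = 1e-6)),
  isTRUE(all.equal(r_freq_overlap, 0.189434, tolerance = 1e-6)),
  isTRUE(all.equal(diff_overlap, 36.96, tolerance = 1e-6)),
  isTRUE(all.equal(MI_diff, 0.4206146, tolerance = 1e-6))
)
```

The second row (the DEL - DEL pair) follows the same pattern, for example 11 / 12 for freq_overlap and 11 - 0.72 for diff_overlap.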
# Total significant hits, by direction
alpi %>% filter(wMI_p.value_FDR) %>% count(wMI_p.value_FDR, direction)
#>   wMI_p.value_FDR direction   n
#> 1            TRUE        CO 108
#> 2            TRUE        ME  18
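
Significant pairs can also be split by direction and ranked within each group. The sketch below uses base R on an invented results table (toy_res) that only borrows column names from the output above, so it runs without the package; the rows and scores are made up.

```r
# Invented results table using column names from the select() output.
toy_res <- data.frame(
  name = c("A - B", "A - C", "B - D", "C - D"),
  direction = c("CO", "ME", "CO", "ME"),
  wMI_p.value_FDR = c(TRUE, TRUE, FALSE, TRUE),
  select_score = c(0.40, 0.12, 0.30, 0.25),
  stringsAsFactors = FALSE
)

# Keep only pairs that pass the FDR filter.
sig <- toy_res[toy_res$wMI_p.value_FDR, ]

# Rank within each direction by decreasing select_score.
sig <- sig[order(sig$direction, -sig$select_score), ]

# One table per interaction type.
by_direction <- split(sig, sig$direction)
print(by_direction$ME)
```

With the real results, replacing toy_res with alpi should give one ranked table per value of direction.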


  • Please refer to the methods described in the paper for the recommended way to filter the results and apply the method:
    • Mina, M., Iyer, A., Tavernari, D., Raynaud, F., & Ciriello, G. (2020). Discovering functional evolutionary dependencies in human cancers. Nature Genetics, 52(11), 1198-1207.


