EPX: An R package for the ensemble of subsets of variables for highly unbalanced binary classification

Computers in Biology and Medicine - Volume 136 - Page 104760 - 2021
Grace G. Hsu (1), Jabed H. Tomal (2), William J. Welch (1)
(1) Department of Statistics, University of British Columbia, 3182 Earth Sciences Building, 2207 Main Mall, Vancouver, BC, V6T 1Z4, Canada
(2) Department of Mathematics and Statistics, Thompson Rivers University, 805 TRU Way, Kamloops, BC, V2C 0C8, Canada
