DeepStack-DTIs: Predicting Drug–Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier
Abstract
Keywords
References
Agyemang B, Wu WP, Kpiebaareh MY, Lei Z, Nanor E, Chen L (2020) Multi-view self-attention for interpretable drug–target interaction prediction. J Biomed Inform 110:103547. https://doi.org/10.1016/j.jbi.2020.103547
Luo Y, Zhao X, Zhou J, Yang J, Zhang Y, Kuang W, Peng J, Chen L, Zeng J (2017) A network integration approach for drug–target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 8(1):573. https://doi.org/10.1038/s41467-017-00680-8
Yuan Q, Gao J, Wu D, Zhang S, Mamitsuka H, Zhu S (2016) DrugE-Rank: improving drug–target interaction prediction of new candidate drugs or targets by ensemble learning to rank. Bioinformatics 32(12):i18–i27. https://doi.org/10.1093/bioinformatics/btw244
Zhao T, Hu Y, Valsdottir LR, Zang T, Peng J (2021) Identifying drug–target interactions based on graph convolutional network and deep neural network. Brief Bioinform 22(2):2141–2150. https://doi.org/10.1093/bib/bbaa044
Wang Y, Zeng J (2013) Predicting drug–target interactions using restricted Boltzmann machines. Bioinformatics 29(13):i126–i134. https://doi.org/10.1093/bioinformatics/btt234
Chen X, Yan CC, Zhang X, Zhang X, Dai F, Yin J, Zhang Y (2016) Drug–target interaction prediction: databases, web servers and computational models. Brief Bioinform 17(4):696–712. https://doi.org/10.1093/bib/bbv066
Dearden JC (2003) In silico prediction of drug toxicity. J Comput Aided Mol Des 17:119–127. https://doi.org/10.1023/A:1025361621494
Chu Y, Kaushik AC, Wang X, Wang W, Zhang Y, Shan X, Salahub DR, Xiong Y, Wei DQ (2021) DTI-CDF: a cascade deep forest model towards the prediction of drug–target interactions based on hybrid features. Brief Bioinform 22(1):451–462. https://doi.org/10.1093/bib/bbz152
Nascimento AC, Prudêncio RB, Costa IG (2016) A multiple kernel learning algorithm for drug–target interaction prediction. BMC Bioinform 17:46. https://doi.org/10.1186/s12859-016-0890-3
Sharma A, Rani R (2018) BE-DTI’: Ensemble framework for drug target interaction prediction using dimensionality reduction and active learning. Comput Methods Programs Biomed 165:151–162. https://doi.org/10.1016/j.cmpb.2018.08.011
Chu Y, Shan X, Chen T, Jiang M, Wang Y, Wang Q, Salahub DR, Xiong Y, Wei DQ (2021) DTI-MLCD: predicting drug–target interactions using multi-label learning with community detection method. Brief Bioinform 22(3):1–15. https://doi.org/10.1093/bib/bbaa205
Thafar MA, Olayan RS, Ashoor H, Albaradei S, Bajic VB, Gao X, Gojobori T, Essack M (2020) DTiGEMS+: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform 12(1):44. https://doi.org/10.1186/s13321-020-00447-2
Ding Y, Tang J, Guo F (2020) Identification of drug–target interactions via dual Laplacian regularized least squares with multiple kernel fusion. Knowl-Based Syst 204:106254. https://doi.org/10.1016/j.knosys.2020.106254
Li H, Gao Z, Kang L, Zhang H, Yang K, Yu K, Luo X, Zhu W, Chen K, Shen J, Wang X, Jiang H (2006) TarFisDock: a web server for identifying drug targets with docking approach. Nucleic Acids Res 34:W219–W224. https://doi.org/10.1093/nar/gkl114
Ezzat A, Wu M, Li XL, Kwoh CK (2019) Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief Bioinform 20(4):1337–1357. https://doi.org/10.1093/bib/bby002
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K (2021) Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief Bioinform 22(1):247–269. https://doi.org/10.1093/bib/bbz157
Mousavian Z, Masoudi-Nejad A (2014) Drug-target interaction prediction via chemogenomic space: learning-based methods. Expert Opin Drug Metab Toxicol 10(9):1273–1287. https://doi.org/10.1517/17425255.2014.950222
Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, Zhou W, Huang J, Tang Y (2012) Prediction of drug-target interactions and drug repositioning via network-based inference. PLoS Comput Biol 8(5):e1002503. https://doi.org/10.1371/journal.pcbi.1002503
Manoochehri HE, Nourani M (2020) Drug-target interaction prediction using semi-bipartite graph model and deep learning. BMC Bioinform 21(S4):248. https://doi.org/10.1186/s12859-020-3518-6
Ding Y, Tang J, Guo F (2017) Identification of drug-target interactions via multiple information integration. Inform Sci 418:546–560. https://doi.org/10.1016/j.ins.2017.08.045
Huang YA, You ZH, Chen X (2018) A systematic prediction of drug-target interactions using molecular fingerprints and protein sequences. Curr Protein Pept Sci 19(5):468–478. https://doi.org/10.2174/1389203718666161122103057
Nakashima H, Nishikawa K (1994) Discrimination of intracellular and extracellular proteins using amino acid composition and residue-pair frequencies. J Mol Biol 238(1):54–61. https://doi.org/10.1006/jmbi.1994.1267
Yap CW, Chen YZ (2005) Prediction of cytochrome P450 3A4, 2D6, and 2C9 inhibitors and substrates by using support vector machines. J Chem Inf Model 45(4):982–992. https://doi.org/10.1021/ci0500536
Wu G, Liu J, Yue X (2019) Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition. BMC Bioinform 20(S3):134. https://doi.org/10.1186/s12859-019-2644-5
Roweis ST, Saul LK (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326. https://doi.org/10.1126/science.290.5500.2323
Zhang Y, Qiao S, Ji S, Han N, Liu D, Zhou J (2019) Identification of DNA–protein binding sites by bootstrap multiple convolutional neural networks on sequence information. Eng Appl Artif Intel 79:58–66. https://doi.org/10.1016/j.engappai.2019.01.003
Yu DJ, Hu J, Tang ZM, Shen HB, Yang J, Yang JY (2013) Improving protein-ATP binding residues prediction by boosting SVMs with random under-sampling. Neurocomputing 104:180–190. https://doi.org/10.1016/j.neucom.2012.10.012
Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M (2008) Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24(13):i232–i240. https://doi.org/10.1093/bioinformatics/btn162
Wang L, You ZH, Chen X, Yan X, Liu G, Zhang W (2018) RFDT: a rotation forest-based predictor for predicting drug-target interactions using drug structure and protein sequence information. Curr Protein Pept Sci 19(5):445–454. https://doi.org/10.2174/1389203718666161114111656
Li Z, Han P, You ZH, Li X, Zhang Y, Yu H, Nie R, Chen X (2017) In silico prediction of drug-target interaction networks based on drug chemical structure and protein sequences. Sci Rep 7:11174. https://doi.org/10.1038/s41598-017-10724-0
Meng FR, You ZH, Chen X, Zhou Y, An JY (2017) Prediction of drug-target interaction networks from the integration of protein sequences and drug chemical structures. Molecules 22(7):1119. https://doi.org/10.3390/molecules22071119
Mahmud SMH, Chen W, Jahan H, Liu Y, Sujan NI, Ahmed S (2019) iDTi-CSsmoteB: identification of drug–target interaction based on drug chemical structure and protein sequence using XGBoost with over-sampling technique SMOTE. IEEE Access 7:48699–48714. https://doi.org/10.1109/ACCESS.2019.2910277
Rayhan F, Ahmed S, Shatabda S, Farid DM, Mousavian Z, Dehzangi A, Rahman MS (2017) iDTI-ESBoost: identification of drug target interaction using evolutionary and structural features with boosting. Sci Rep 7:17731. https://doi.org/10.1038/s41598-017-18025-2
Yang Y, Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Zhou Y (2017) SPIDER2: a package to predict secondary structure, accessible surface area, and main-chain torsional angles by deep neural networks. Methods Mol Biol 1484:55–63. https://doi.org/10.1007/978-1-4939-6406-2_6
Ezzat A, Wu M, Li XL, Kwoh CK (2016) Drug-target interaction prediction via class imbalance-aware ensemble learning. BMC Bioinform 17(S19):509. https://doi.org/10.1186/s12859-016-1377-y
Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V, Djoumbou Y, Eisner R, Guo AC, Wishart DS (2011) DrugBank 3.0: a comprehensive resource for “Omics” research on drugs. Nucleic Acids Res 39:D1035–D1041. https://doi.org/10.1093/nar/gkq1126
Shi H, Liu S, Chen J, Li X, Ma Q, Yu B (2019) Predicting drug-target interactions using Lasso with random forest based on evolutionary information and chemical structure. Genomics 111(6):1839–1852. https://doi.org/10.1016/j.ygeno.2018.12.007
Mahmud SMH, Chen W, Meng H, Jahan H, Liu Y, Hasan SMM (2020) Prediction of drug-target interaction based on protein features using undersampling and feature selection techniques with boosting. Anal Biochem 589:113507. https://doi.org/10.1016/j.ab.2019.113507
Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 34:D354–D357. https://doi.org/10.1093/nar/gkj102
Schomburg I, Chang A, Ebeling C, Gremse M, Heldt C, Huhn G, Schomburg D (2004) BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res 32:D431–D433. https://doi.org/10.1093/nar/gkh081
Günther S, Kuhn M, Dunkel M, Campillos M, Senger C, Petsalaki E, Ahmed J, Urdiales EG, Gewiess A, Jensen LJ, Schneider R, Skoblo R, Russell RB, Bourne PE, Bork P, Preissner R (2007) SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res 36:D919–D922. https://doi.org/10.1093/nar/gkm862
Kuang Q, Xu X, Li R, Dong Y, Li Y, Huang Z, Li Y, Li M (2015) An eigenvalue transformation technique for predicting drug-target interaction. Sci Rep 5:13867. https://doi.org/10.1038/srep13867
Yu B, Li S, Qiu W, Wang M, Du J, Zhang Y, Chen X (2018) Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction. BMC Genomics 19:478. https://doi.org/10.1186/s12864-018-4849-9
Liu Y, Yu Z, Chen C, Han Y, Yu B (2020) Prediction of protein crotonylation sites through LightGBM classifier based on SMOTE and elastic net. Anal Biochem 609:113903. https://doi.org/10.1016/j.ab.2020.113903
Qiu W, Li S, Cui X, Yu Z, Wang M, Du J, Peng Y, Yu B (2018) Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou’s pseudo-amino acid composition. J Theor Biol 450:86–103. https://doi.org/10.1016/j.jtbi.2018.04.026
Jones DT (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292(2):195–202. https://doi.org/10.1006/jmbi.1999.3091
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402. https://doi.org/10.1093/nar/25.17.3389
Chou KC (2001) Prediction of protein cellular attributes using pseudo amino acid composition. Proteins 43(3):246–255. https://doi.org/10.1002/prot.1035
Chen C, Zhang Q, Yu B, Yu Z, Lawrence PJ, Ma Q, Zhang Y (2020) Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput Biol Med 123:103899. https://doi.org/10.1016/j.compbiomed.2020.103899
Cui X, Yu Z, Yu B, Wang M, Tian B, Ma Q (2019) UbiSitePred: a novel method for improving the accuracy of ubiquitination sites prediction by using LASSO to select the optimal Chou’s pseudo components. Chemom Intell Lab Syst 184:28–43. https://doi.org/10.1016/j.chemolab.2018.11.012
Yu B, Lou L, Li S, Zhang Y, Qiu W, Wu X, Wang M, Tian B (2017) Prediction of protein structural class for low-similarity sequences using Chou’s pseudo amino acid composition and wavelet denoising. J Mol Graph Model 76:260–273. https://doi.org/10.1016/j.jmgm.2017.07.012
Heffernan R, Yang Y, Paliwal K, Zhou Y (2017) Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility. Bioinformatics 33(18):2842–2849. https://doi.org/10.1093/bioinformatics/btx218
Yamanishi Y, Pauwels E, Saigo H, Stoven V (2011) Extracting sets of chemical substructures and protein domains governing drug-target interactions. J Chem Inf Model 51(5):1183–1194. https://doi.org/10.1021/ci100476q
Cao DS, Hu QN, Xu QS, Yang YN, Zhao JC, Lu HM, Zhang LX, Liang YZ (2011) In silico classification of human maximum recommended daily dose based on modified random forest and substructure fingerprint. Anal Chim Acta 692(1–2):50–56. https://doi.org/10.1016/j.aca.2011.02.010
O’Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR (2011) Open Babel: an open chemical toolbox. J Cheminform 3:33. https://doi.org/10.1186/1758-2946-3-33
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357. https://doi.org/10.1613/jair.953
Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232. https://doi.org/10.1214/aos/1013203451
Chen C, Zhang Q, Ma Q, Yu B (2019) LightGBM-PPI: Predicting protein-protein interactions through LightGBM with multi-information fusion. Chemom Intell Lab Syst 191:54–64. https://doi.org/10.1016/j.chemolab.2019.06.003
Zhan ZH, You ZH, Li LP, Zhou Y, Yi HC (2018) Accurate prediction of ncRNA-protein interactions from the integration of sequence and evolutionary information. Front Genet 9:458. https://doi.org/10.3389/fgene.2018.00458
Wolpert DH (1992) Stacked generalization. Neural Netw 5(2):241–259. https://doi.org/10.1016/S0893-6080(05)80023-1
Mishra A, Pokhrel P, Hoque MT (2019) StackDPPred: a stacking based prediction of DNA-binding protein from sequence. Bioinformatics 35(3):433–441. https://doi.org/10.1093/bioinformatics/bty653
Wu H, Xing Y, Ge W, Liu X, Zou J, Zhou C, Liao J (2020) Drug-drug interaction extraction via hybrid neural networks on biomedical literature. J Biomed Inform 106:103432. https://doi.org/10.1016/j.jbi.2020.103432
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507. https://doi.org/10.1126/science.1127647
Mousavian Z, Khakabimamaghani S, Kavousi K, Masoudi-Nejad A (2016) Drug-target interaction prediction from PSSM based evolutionary information. J Pharmacol Toxicol Methods 78:42–51. https://doi.org/10.1016/j.vascn.2015.11.002
Wang X, Zhang Y, Yu B, Salhi A, Chen R, Wang L, Liu Z (2021) Prediction of protein-protein interaction sites through eXtreme gradient boosting with kernel principal component analysis. Comput Biol Med 134:104516. https://doi.org/10.1016/j.compbiomed.2021.104516
Yu B, Qiu W, Chen C, Ma A, Jiang J, Zhou H, Ma Q (2020) SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting. Bioinformatics 36(4):1074–1081. https://doi.org/10.1093/bioinformatics/btz734
Yu B, Yu Z, Chen C, Ma A, Liu B, Tian B, Ma Q (2020) DNNAce: Prediction of prokaryote lysine acetylation sites through deep neural networks with multi-information fusion. Chemom Intell Lab Syst 200:103999. https://doi.org/10.1016/j.chemolab.2020.103999
Sun X, Jin T, Chen C, Cui X, Ma Q, Yu B (2020) RBPro-RF: Use Chou’s 5-steps rule to predict RNA-binding proteins via random forest with elastic net. Chemom Intell Lab Syst 197:103919. https://doi.org/10.1016/j.chemolab.2019.103919
Wang M, Cui X, Li S, Yang X, Ma A, Zhang Y, Yu B (2020) DeepMal: accurate prediction of protein malonylation sites by deep neural networks. Chemom Intell Lab Syst 207:104175. https://doi.org/10.1016/j.chemolab.2020.104175
Liu XY, Wu J, Zhou ZH (2009) Exploratory undersampling for class-imbalance learning. IEEE Trans Syst Man Cybern B Cybern 39(2):539–550. https://doi.org/10.1109/TSMCB.2008.2007853
Bao L, Juan C, Li J, Zhang Y (2016) Boosted Near-miss Under-sampling on SVM ensembles for concept detection in large-scale imbalanced datasets. Neurocomputing 172:198–206. https://doi.org/10.1016/j.neucom.2014.05.096
Taguchi YH, Oono Y (2005) Relational patterns of gene expression via non-metric multidimensional scaling analysis. Bioinformatics 21(6):730–740. https://doi.org/10.1093/bioinformatics/bti067
Ross BC (2014) Mutual information between discrete and continuous data sets. PLoS ONE 9(2):e87357. https://doi.org/10.1371/journal.pone.0087357
Lai CM, Yeh WC, Chang CY (2016) Gene selection using information gain and improved simplified swarm optimization. Neurocomputing 218:331–338. https://doi.org/10.1016/j.neucom.2016.08.089
Wang Y, Tseng M (2014) Attribute selection for product configurator design based on Gini index. Int J Prod Res 52:6136–6145. https://doi.org/10.1080/00207543.2014.917216
Zou Q, Zeng J, Cao L, Ji R (2016) A novel features ranking metric with application to scalable visual and bioinformatics data classification. Neurocomputing 173:346–354. https://doi.org/10.1016/j.neucom.2014.12.123
Kandaswamy KK, Pugalenthi G, Hazrati MK, Kalies KU, Martinetz T (2011) BLProt: prediction of bioluminescent proteins based on support vector machine and ReliefF feature selection. BMC Bioinform 12:345. https://doi.org/10.1186/1471-2105-12-345
Chen C, Shi H, Jiang Z, Salhi A, Chen R, Cui X, Yu B (2021) DNN-DTIs: Improved drug-target interactions prediction using XGBoost feature selection and deep neural network. Comput Biol Med 136:104676. https://doi.org/10.1016/j.compbiomed.2021.104676
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139. https://doi.org/10.1006/jcss.1997.1504
Nigsch F, Bender A, Buuren BV, Tissen J, Nigsch E, Mitchell JBO (2006) Melting point prediction employing k-nearest neighbor algorithms and genetic parameter optimization. J Chem Inf Model 46(6):2412–2422. https://doi.org/10.1021/ci060149f
Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106. https://doi.org/10.1007/BF00116251
Box JF (1987) Guinness, Gosset, Fisher, and Small Samples. Stat Sci 2(1):45–52. https://doi.org/10.1214/ss/1177013437
Cao DS, Liu S, Xu QS, Lu HM, Huang JH, Hu QN, Liang YZ (2012) Large-scale prediction of drug-target interactions using protein sequences and drug topological structures. Anal Chim Acta 752:1–10. https://doi.org/10.1016/j.aca.2012.09.021
Wang L, You ZH, Chen X, Xia SX, Liu F, Yan X, Zhou Y, Song KJ (2018) A computational-based method for predicting drug-target interactions by using stacked autoencoder deep neural network. J Comput Biol 25(3):361–373. https://doi.org/10.1089/cmb.2017.0135
Xia LY, Yang ZY, Zhang H, Liang Y (2019) Improved prediction of drug-target interactions using self-paced learning with collaborative matrix factorization. J Chem Inf Model 59(7):3340–3351. https://doi.org/10.1021/acs.jcim.9b00408
Meece FA, Ahmed G, Nair H, Santhamma B, Tekmal RR, Zhao C, Pollok NE, Lara J, Shaked Z, Nickisch K (2018) Esters of levonorgestrel and etonogestrel intended as single, subcutaneous-injection, long-lasting contraceptives. Steroids 137:47–56. https://doi.org/10.1016/j.steroids.2018.07.010
Radin DP, Patel P (2016) Delineating the molecular mechanisms of tamoxifen’s oncolytic actions in estrogen receptor-negative cancers. Eur J Pharmacol 781:173–180. https://doi.org/10.1016/j.ejphar.2016.04.017
Gainder S, Thakur M, Saha SC, Prakash M (2019) To study the changes in fetal hemodynamics with intravenous labetalol or nifedipine in acute severe hypertension. Pregnancy Hypertens 15:12–15. https://doi.org/10.1016/j.preghy.2018.02.011
Ferrari MD, Saxena PRS (1992) Clinical effects and mechanism of action of sumatriptan in migraine. Clin Neurol Neurosur 94:73–77. https://doi.org/10.1016/0303-8467(92)90028-2
Matabosch X, Pozo OJ, Monfort N, Pérez-Mañá C, Farré M, Marcos J, Segura J, Ventura R (2013) Urinary profile of methylprednisolone and its metabolites after oral and topical administrations. J Steroid Biochem 138:214–221. https://doi.org/10.1016/j.jsbmb.2013.05.019