Prediction of synergistic drug combinations using PCA-initialized deep learning

BioData Mining - Volume 14 - Pages 1-15 - 2021
Jun Ma1,2, Alison Motsinger-Reif2
1Bioinformatics Research Center, North Carolina State University, Raleigh, USA
2Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Durham, USA

Abstract

Cancer is one of the main causes of death worldwide. Combination drug therapy has been a mainstay of cancer treatment for decades and has been shown to reduce host toxicity and prevent the development of acquired drug resistance. However, the immense number of possible drug combinations and the large synergistic space make it infeasible to screen all effective drug pairs experimentally. It is therefore crucial to develop computational approaches that predict drug synergy and guide experimental design for the discovery of rational combination therapies. We present a new deep learning approach that predicts synergistic drug combinations by integrating gene expression profiles from cell lines with chemical structure data. Specifically, we use principal component analysis (PCA) to reduce the dimensionality of the chemical descriptor data and the gene expression data, and then propagate the low-dimensional data through a neural network to predict drug synergy values. We apply our method to O'Neil's high-throughput drug combination screening data and to a dataset from the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge. We compare the neural network approach with and without dimension reduction. We also compare its performance with three state-of-the-art machine learning methods (Random Forests, XGBoost, and elastic net), each with and without PCA-based dimensionality reduction. Our approach outperforms the other machine learning methods, and dimension reduction dramatically decreases computation time without sacrificing accuracy.
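The pipeline summarized in the abstract (PCA on each feature block, then a neural network regressor on the reduced features) can be illustrated with a minimal sketch. This is not the authors' implementation: scikit-learn's `MLPRegressor` stands in for their network, the data are random placeholders, and the number of components, the layer sizes, and the helper names `fit_reducer` and `featurize` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def fit_reducer(train_block, n_components):
    """Fit scaling + PCA on training rows only; return a projection function."""
    scaler = StandardScaler().fit(train_block)
    pca = PCA(n_components=n_components).fit(scaler.transform(train_block))
    return lambda block: pca.transform(scaler.transform(block))


# Synthetic placeholders (assumption): each sample is a drug pair tested on a cell line.
rng = np.random.default_rng(0)
n_train, n_test = 200, 50
n = n_train + n_test
drug_a = rng.normal(size=(n, 300))   # chemical descriptors of the first drug
drug_b = rng.normal(size=(n, 300))   # chemical descriptors of the second drug
expr = rng.normal(size=(n, 500))     # cell-line gene expression profile
synergy = rng.normal(size=n)         # synergy score to predict

train, test = slice(0, n_train), slice(n_train, n)

# One PCA for chemical descriptors (shared by both drugs), one for expression.
reduce_chem = fit_reducer(np.vstack([drug_a[train], drug_b[train]]), n_components=20)
reduce_expr = fit_reducer(expr[train], n_components=20)


def featurize(a, b, e):
    """Concatenate the low-dimensional representations of both drugs and the cell line."""
    return np.hstack([reduce_chem(a), reduce_chem(b), reduce_expr(e)])


X_train = featurize(drug_a[train], drug_b[train], expr[train])
X_test = featurize(drug_a[test], drug_b[test], expr[test])

# Small feed-forward regressor on the reduced features (layer sizes are arbitrary).
model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
model.fit(X_train, synergy[train])
predictions = model.predict(X_test)
```

Fitting the scaler and PCA on training rows only keeps test information out of the projection. The reduced input here has 60 columns instead of 1,100, which is the kind of shrinkage that would account for a shorter training time.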
