Artificial intelligence for prediction of biological activities and generation of molecular hits using stereochemical information

Tiago Pereira (1), Maryam Abbasi (1), Rita I. Oliveira (2,3), Romina A. Guedes (2,3), Jorge A. R. Salvador (2,3), Joel P. Arrais (1)
(1) Centre for Informatics and Systems, Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal.
(2) Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, Coimbra, Portugal.
(3) Laboratory of Pharmaceutical Chemistry, Faculty of Pharmacy, University of Coimbra, Coimbra, Portugal.

Abstract

In this work, we develop a method for generating targeted hit compounds by applying deep reinforcement learning and attention mechanisms to predict binding affinity against a biological target while considering stereochemical information. The novelty of this work is a deep model, the Predictor, that establishes the relationship between chemical structures and their corresponding $pIC_{50}$ values. We thoroughly study the effect of different molecular descriptors, namely ECFP4, ECFP6, SMILES and RDKFingerprint, and we demonstrate the importance of attention mechanisms for capturing long-range dependencies in molecular sequences. Because stereochemical information is important for the binding mechanism, it is employed in both the prediction and the generation processes. To identify the most promising hits, we apply a self-adaptive multi-objective optimization strategy. Moreover, to ensure that stereochemical information is present, we consider all possible enumerated stereoisomers to provide the most appropriate 3D structures. We evaluate this approach on Ubiquitin-Specific Protease 7 (USP7) by generating putative inhibitors for this target. The Predictor that uses SMILES notation as the descriptor, combined with a bidirectional recurrent neural network and an attention mechanism, achieves the best performance. Additionally, our methodology identifies the regions of the generated molecules that are important for the interaction with the receptor's active site. The results also demonstrate that it is possible to discover synthesizable molecules with high biological affinity for the target, together with an indication of their optimal stereochemical conformation.
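For readers less familiar with the $pIC_{50}$ scale that the Predictor regresses on, the snippet below is a minimal sketch of the standard definition, $pIC_{50} = -\log_{10}(IC_{50})$ with $IC_{50}$ expressed in mol/L. The function names and the choice of nanomolar input are illustrative assumptions and are not taken from the authors' code.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Hypothetical helper for illustration only.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 mol/L, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    return 9.0 - math.log10(ic50_nm)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Inverse conversion: pIC50 back to an IC50 in nanomolar."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # A more potent compound (lower IC50) has a higher pIC50.
    for ic50 in (1.0, 100.0, 10000.0):
        print(f"IC50 = {ic50:g} nM, pIC50 = {pic50_from_ic50_nm(ic50):.2f}")
```

On this scale, a one-unit increase in $pIC_{50}$ corresponds to a tenfold lower $IC_{50}$, which is why a higher predicted value indicates a stronger putative inhibitor.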
