Genetic algorithm optimization in drug design QSAR: Bayesian-regularized genetic neural networks (BRGNN) and genetic algorithm-optimized support vectors machines (GA-SVM)
Abstract
Keywords
References
Gasteiger J (2006) Chemoinformatics: a new field with a long tradition. Anal Bioanal Chem 384: 57–64. doi: 10.1007/s00216-005-0065-y
Cramer RD, Patterson DE, Bunce JD (1988) Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc 110: 5959–5967. doi: 10.1021/ja00226a005
Klebe G, Abraham U, Mietzner T (1994) Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J Med Chem 37: 4130–4146. doi: 10.1021/jm00050a010
Folkers G, Merz A, Rognan D (1993) CoMFA: scope and limitations. In: Kubinyi H (ed) 3D-QSAR in drug design. Theory, methods and applications. ESCOM Science Publishers BV, Leiden, pp 583–618
Hansch C, Kurup A, Garg R, Gao H (2001) Chem-bioinformatics and QSAR: a review of QSAR lacking positive hydrophobic terms. Chem Rev 101: 619–672. doi: 10.1021/cr0000067
Sabljic A (1990) Topological indices and environmental chemistry. In: Karcher W, Devillers J (eds) Practical applications of quantitative structure–activity relationships (QSAR) in environmental chemistry and toxicology. Kluwer, Dordrecht, pp 61–82
Karelson M, Lobanov VS, Katritzky AR (1996) Quantum-chemical descriptors in QSAR/QSPR studies. Chem Rev 96: 1027–1043. doi: 10.1021/cr950202r
Livingstone DJ, Manallack DT, Tetko IV (1997) Data modelling with neural networks: advantages and limitations. J Comput Aid Mol Des 11: 135–142. doi: 10.1023/A:1008074223811
Burbidge R, Trotter M, Buxton B, Holden S (2001) Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput Chem 26: 5–14. doi: 10.1016/S0097-8485(01)00094-8
Caballero J, Fernández M (2006) Linear and non-linear modeling of antifungal activity of some heterocyclic ring derivatives using multiple linear regression and Bayesian-regularized neural networks. J Mol Model 12: 168–181. doi: 10.1007/s00894-005-0014-x
Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor, MI
Cartwright HM (1993) Applications of artificial intelligence in chemistry. Oxford University Press, Oxford
Cho SJ, Hermsmeier MA (2002) Genetic algorithm guided selection: variable selection and subset selection. J Chem Inf Comput Sci 42: 927–936. doi: 10.1021/ci010247v
Duchowicz PR, Vitale MG, Castro EA, Fernandez M, Caballero J (2007) QSAR analysis for heterocyclic antifungals. Bioorg Med Chem 15: 2680–2689. doi: 10.1016/j.bmc.2007.01.039
Fernández M, Caballero J, Fernández L, Abreu JI, Garriga M (2007) Protein radial distribution function (P-RDF) and Bayesian-regularized genetic neural networks for modeling protein conformational stability: chymotrypsin inhibitor 2 mutants. J Mol Graph Model 26: 748–759. doi: 10.1016/j.jmgm.2007.04.011
Caballero J, Garriga M, Fernández M (2005) Genetic neural network modeling of the selective inhibition of the intermediate-conductance Ca2+-activated K+ channel by some triarylmethanes using topological charge indexes descriptors. J Comput Aid Mol Des 19: 771–789. doi: 10.1007/s10822-005-9025-z
Caballero J, Garriga M, Fernández M (2006) 2D Autocorrelation modeling of the negative inotropic activity of calcium entry blockers using Bayesian-regularized genetic neural networks. Bioorg Med Chem 14: 3330–3340. doi: 10.1016/j.bmc.2005.12.048
Caballero J, Tundidor-Camba A, Fernández M (2007) Modeling of the inhibition constant (Ki) of some Cruzain ketone-based inhibitors using 2D spatial autocorrelation vectors and data-diverse ensembles of Bayesian-regularized genetic neural networks. QSAR Comb Sci 26: 27–40. doi: 10.1002/qsar.200610001
Fernández M, Tundidor-Camba A, Caballero J (2005) Modeling of cyclin-dependent kinase inhibition by 1H-pyrazolo[3,4-d]pyrimidine derivatives using artificial neural networks ensembles. J Chem Inf Model 45: 1884–1895. doi: 10.1021/ci050263i
Fernández M, Caballero J (2006) Bayesian-regularized genetic neural networks applied to the modeling of non-peptide antagonists for the human luteinizing hormone-releasing hormone receptor. J Mol Graph Model 25: 410–422. doi: 10.1016/j.jmgm.2006.02.005
Fernandez M, Carreiras MC, Marco JL, Caballero J (2006) Modeling of acetylcholinesterase inhibition by tacrine analogues using Bayesian-regularized genetic neural networks and ensemble averaging. J Enzym Inhib Med Chem 21: 647–661. doi: 10.1080/14756360600862366
Fernández M, Caballero J (2006) Modeling of activity of cyclic urea HIV-1 protease inhibitors using regularized-artificial neural networks. Bioorg Med Chem 14: 280–294. doi: 10.1016/j.bmc.2005.08.022
Fernández M, Caballero J, Tundidor-Camba A (2006) Linear and nonlinear QSAR study of N-hydroxy-2-[(phenylsulfonyl)amino]acetamide derivatives as matrix metalloproteinase inhibitors. Bioorg Med Chem 14: 4137–4150. doi: 10.1016/j.bmc.2006.01.072
Fernández M, Caballero J (2006) Ensembles of Bayesian-regularized genetic neural networks for modeling of acetylcholinesterase inhibition by huprines. Chem Biol Drug Des 68: 201–212. doi: 10.1111/j.1747-0285.2006.00435.x
González MP, Caballero J, Tundidor-Camba A, Helguera AM, Fernández M (2006) Modeling of farnesyltransferase inhibition by some thiol and non-thiol peptidomimetic inhibitors using genetic neural networks and RDF approaches. Bioorg Med Chem 14: 200–213. doi: 10.1016/j.bmc.2005.08.009
Di Fenza A, Alagona G, Ghio C, Leonardi R, Giolitti A, Madami A (2007) Caco-2 cell permeability modelling: a neural network coupled genetic algorithm approach. J Comput Aid Mol Des 21: 207–221. doi: 10.1007/s10822-006-9098-3
So S, Karplus M (1996) Evolutionary optimization in quantitative structure–activity relationship: an application of genetic neural networks. J Med Chem 39: 1521–1530. doi: 10.1021/jm9507035
Gao H (2001) Application of BCUT metrics and genetic algorithm in binary QSAR analysis. J Chem Inf Comput Sci 41: 402–407. doi: 10.1021/ci000306p
Fernández M, Fernández L, Abreu JI, Garriga M (2008) Classification of voltage-gated K(+) ion channels from 3D pseudo-folding graph representation of protein sequences using genetic algorithm-optimized support vector machines. J Mol Graph Model 26: 1306–1314. doi: 10.1016/j.jmgm.2008.01.001
Caballero J, Fernández L, Garriga M, Abreu JI, Collina S, Fernández M (2007) Proteometric study of ghrelin receptor function variations upon mutations using amino acid sequence autocorrelation vectors and genetic algorithm-based least square support vector machines. J Mol Graph Model 26: 166–178. doi: 10.1016/j.jmgm.2006.11.002
Hemmateenejad B, Miri R, Akhond M, Shamsipur M (2002) QSAR study of the calcium channel antagonist activity of some recently synthesized dihydropyridine derivatives. An application of genetic algorithm for variable selection in MLR and PLS methods. Chemom Intell Lab 64: 91–99. doi: 10.1016/S0169-7439(02)00068-0
Hemmateenejad B, Akhond M, Miri R, Shamsipur M (2003) Genetic algorithm applied to the selection of factors in principal component-artificial neural networks: application to QSAR study of calcium channel antagonist activity of 1,4-dihydropyridines (nifedipine analogous). J Chem Inf Comput Sci 43: 1328–1334. doi: 10.1021/ci025661p
Hemmateenejad B (2004) Optimal QSAR analysis of the carcinogenic activity of drugs by correlation ranking and genetic algorithm-based PCR. J Chemom 18: 475–485. doi: 10.1002/cem.891
Yamashita F, Wanchana S, Hashida M (2002) Quantitative structure/property relationship analysis of caco-2 permeability using a genetic algorithm-based partial least squares method. J Pharm Sci 91: 2230–2238. doi: 10.1002/jps.10214
Selwood DL, Livingstone DJ, Comley JCW, O’Dowd AB, Hudson AT, Jackson P, Jandu KS, Rose VS, Stables JN (1990) Structure–activity relationships of antifilarial antimycin analogues: a multivariate pattern recognition study. J Med Chem 33: 136–142. doi: 10.1021/jm00163a023
Ren Y, Liu H, Li S, Yao X, Liu M (2007) Prediction of binding affinities to β1 isoform of human thyroid hormone receptor by genetic algorithm and projection pursuit regression. Bioorg Med Chem Lett 17: 2474–2482. doi: 10.1016/j.bmcl.2007.02.025
Turner DB, Willett P (2000) Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA). J Comput Aid Mol Des 14: 1–21. doi: 10.1023/A:1008180020974
Xue L, Bajorath J (2000) Molecular descriptors for effective classification of biologically active compounds based on principal component analysis identified by a genetic algorithm. J Chem Inf Comput Sci 40: 801–809. doi: 10.1021/ci000322m
Kamphausen S, Höltge N, Wirsching F, Morys-Wortmann C, Riester D, Goetz R, Thürk M, Schwienhorst A (2002) Genetic algorithm for the design of molecules with desired properties. J Comput Aid Mol Des 16: 551–567. doi: 10.1023/A:1021928016359
Guo W, Cai W, Shao X, Pan Z (2005) Application of genetic stochastic resonance algorithm to quantitative structure–activity relationship study. Chemom Intell Lab 75: 181–188. doi: 10.1016/j.chemolab.2004.07.004
Teixido M, Belda I, Rosello X, Gonzalez S, Fabre M, Llora X, Bacardit J, Garrell JM, Vilaro S, Albericio F, Giralt E (2003) Development of a genetic algorithm to design and identify peptides that can cross the blood–brain barrier 1. Design and validation in silico. QSAR Comb Sci 22: 745–753. doi: 10.1002/qsar.200320004
So SS, Karplus M (1997) Three-dimensional quantitative structure–activity relationships from molecular similarity matrices and genetic neural networks: 1. Method and validations. J Med Chem 40: 4347–4359. doi: 10.1021/jm970487v
So SS, Karplus M (1997) Three-dimensional quantitative structure–activity relationships from molecular similarity matrices and genetic neural networks: 2. Applications. J Med Chem 40: 4360–4371. doi: 10.1021/jm970488n
Chiu TL, So SS (2003) Genetic neural networks for functional approximation. QSAR Comb Sci 22: 519–526. doi: 10.1002/qsar.200310004
Patankar SJ, Jurs PC (2000) Prediction of IC50 values for ACAT inhibitors from molecular structure. J Chem Inf Comput Sci 40: 706–723. doi: 10.1021/ci990125r
Kauffman GW, Jurs PC (2000) Prediction of inhibition of the sodium ion-proton antiporter by benzoylguanidine derivatives from molecular structure. J Chem Inf Comput Sci 40: 753–761. doi: 10.1021/ci9901237
Kauffman GW, Jurs PC (2001) QSAR and k-nearest neighbor classification analysis of selective cyclooxygenase-2 inhibitors using topologically-based numerical descriptors. J Chem Inf Comput Sci 41: 1553–1560. doi: 10.1021/ci010073h
Mattioni BE, Jurs PC (2002) Development of quantitative structure–activity relationship and classification models for a set of carbonic anhydrase inhibitors. J Chem Inf Comput Sci 42: 94–102. doi: 10.1021/ci0100696
Bakken GA, Jurs PC (2001) QSARs for 6-azasteroids as inhibitors of human type 1 5alpha-reductase: prediction of binding affinity and selectivity relative to 3-BHSD. J Chem Inf Comput Sci 41: 1255–1265. doi: 10.1021/ci010036q
Patankar SJ, Jurs PC (2002) Prediction of glycine/NMDA receptor antagonist inhibition from molecular structure. J Chem Inf Comput Sci 42: 1053–1068. doi: 10.1021/ci010114+
Burden FR, Winkler DA (1999) Robust QSAR models using Bayesian regularized neural networks. J Med Chem 42: 3183–3187. doi: 10.1021/jm980697n
Winkler DA, Burden FR (2004) Bayesian neural nets for modeling in drug discovery. Biosilico 2: 104–111. doi: 10.1016/S1741-8364(04)02393-5
MATLAB 7.0. Program (2004) MathWorks Inc., Natick. http://www.mathworks.com
The MathWorks Inc. (2004) Genetic algorithm and direct search toolbox user’s guide for use with MATLAB. The MathWorks Inc., Natick
The MathWorks Inc. (2004) Neural network toolbox user’s guide for use with MATLAB. The MathWorks Inc., Natick
Mackay DJC (1992) A practical Bayesian framework for backpropagation networks. Neural Comput 4: 448–472. doi: 10.1162/neco.1992.4.3.448
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20: 273–297
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2: 1–47. doi: 10.1023/A:1009715923555
Fröhlich H, Chapelle O, Schölkopf B (2003) Feature selection for support vector machines by means of genetic algorithms. In: Proceedings of 15th IEEE international conference on tools with AI, Sacramento, CA, USA, pp 142–148. doi: 10.1109/TAI.2003.1250182
Yang SY, Huang Q, Li LL, Ma CY, Zhang H, Bai R, Teng QZ, Xiang ML, Wei YQ (2009) An integrated scheme for feature selection and parameter setting in the support vector machine modeling and its application to the prediction of pharmacokinetic properties of drugs. Artif Intell Med 46: 155–163. doi: 10.1016/j.artmed.2008.07.001
Ma CY, Yang SY, Zhang H, Xiang ML, Huang Q, Wei YQ (2008) Prediction models of human plasma protein binding rate and oral bioavailability derived by using GA-CG-SVM method. J Pharmaceut Biomed 47: 677–682. doi: 10.1016/j.jpba.2008.03.023
Zhang H, Chen QY, Xiang ML, Ma CY, Huang Q, Yang SY (2009) In silico prediction of mitochondrial toxicity by using GA-CG-SVM approach. Toxicol in Vitro 23: 134–140. doi: 10.1016/j.tiv.2008.09.017
Chang CC, Lin CJ (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm
Golbraikh A, Tropsha A (2002) Beware of q2! J Mol Graph Model 20: 269–276. doi: 10.1016/S1093-3263(01)00123-1
Afantitis A, Melagraki G, Sarimveis H, Igglessi-Markopoulou O, Kollias G (2009) A novel QSAR model for predicting the inhibition of CXCR3 receptor by 4-N-aryl-[1,4]diazepane ureas. Eur J Med Chem 44: 877–884. doi: 10.1016/j.ejmech.2008.05.028
Agrafiotis DK, Cedeño W, Lobanov VS (2002) On the use of neural network ensembles in QSAR and QSPR. J Chem Inf Comput Sci 42: 903–911. doi: 10.1021/ci0203702
Caballero J, Fernández L, Abreu JI, Fernández M (2006) Amino acid sequence autocorrelation vectors and ensembles of Bayesian-regularized genetic neural networks for prediction of conformational stability of human lysozyme mutants. J Chem Inf Model 46: 1255–1268. doi: 10.1021/ci050507z
Fernández L, Caballero J, Abreu JI, Fernández M (2007) Amino acid sequence autocorrelation vectors and Bayesian-regularized genetic neural networks for modeling protein conformational stability: gene V protein mutants. Proteins 67: 834–852. doi: 10.1002/prot.21349
MOPAC 6.0. (1993) Frank J. Seiler Research Laboratory. US Air Force Academy, Colorado Springs, CO
Fernández M, Caballero J (2007) QSAR models for predicting the activity of non-peptide luteinizing hormone-releasing hormone (LHRH) antagonists derived from erythromycin A using quantum chemical properties. J Mol Model 13: 465–476. doi: 10.1007/s00894-006-0163-6
Fernández M, Caballero J (2007) QSAR modeling of matrix metalloproteinase inhibition by N-hydroxy-α-phenylsulfonylacetamide derivatives. Bioorg Med Chem 15: 6298–6310. doi: 10.1016/j.bmc.2007.06.014
Fatemi MH, Gharaghani S (2007) A novel QSAR model for prediction of apoptosis-inducing activity of 4-aryl-4-H-chromenes based on support vector machine. Bioorg Med Chem 15: 7746–7754. doi: 10.1016/j.bmc.2007.08.057
Todeschini R, Consonni V, Pavan M (2002) DRAGON, version 2.1. Talete SRL, Milan, Italy
Cerius2, Version 4.11, http://www.accelrys.com
VCCLAB, Virtual Computational Chemistry Laboratory (2005) http://www.vcclab.org
Fernandez M, Abreu JI (2006) PROTMETRICS; version 1.0. Molecular Modeling Group University of Matanzas, Matanzas, Cuba
Kawashima S, Kanehisa M (2000) AAindex: amino acid index database. Nucleic Acids Res 28: 374. doi: 10.1093/nar/28.1.374
Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43: 59–69. doi: 10.1007/BF00337288
Dykens JA, Will Y (2007) The significance of mitochondrial toxicity testing in drug development. Drug Discov Today 12: 777–785. doi: 10.1016/j.drudis.2007.07.013
Foye WO (1995) Cancer chemotherapeutic agents. American Chemical Society, Washington, DC
Ashkenazi A, Dixit VM (1998) Death receptors: signaling and modulation. Science 281: 1305–1308. doi: 10.1126/science.281.5381.1305
Bartus RT, Dean RL, Beer B, Lippa AS (1982) The cholinergic hypothesis of geriatric memory dysfunction. Science 217: 408–417. doi: 10.1126/science.7046051
Radic Z, Reiner E, Taylor P (1991) Role of the peripheral anionic site on acetylcholinesterase: inhibition by substrates and coumarin derivatives. Mol Pharmacol 39: 98–104
Pang YP, Quiram P, Jelacic T, Hong F, Brimijoin S (1996) Highly potent, selective, and low cost bis-tetrahydroaminacrine inhibitors of acetylcholinesterase: steps toward novel drugs for treating Alzheimer’s disease. J Biol Chem 271: 23646–23649. doi: 10.1074/jbc.271.39.23646
Katz RA, Skalka AM (1994) The retroviral enzymes. Annu Rev Biochem 63: 133–173. doi: 10.1146/annurev.bi.63.070194.001025
Kempf DJ, Marsh KC, Denissen JF, McDonald E, Vasavanonda S, Flentge CA, Green BE, Fino L, Park CH, Kong XP, Wideburg NE, Saldivar A, Ruiz L, Kati WM, Sham HL, Robins T, Stewart KD, Hsu A, Plattner JJ, Leonard JM, Norbeck DW (1995) ABT-538 is a potent inhibitor of human immunodeficiency virus protease and has high oral bioavailability in humans. Proc Natl Acad Sci USA 92: 2484–2488. doi: 10.1073/pnas.92.7.2484
Reddy P, Ross J (1999) Amprenavir: a protease inhibitor for the treatment of patients with HIV-1 infection. Formulary 34: 567–577
Vacca JP, Dorsey BD, Schleif WA, Levin RB, McDaniel SL, Darke PL, Zugay J, Quintero JC, Blahy OM, Roth E, Sardana VV, Schlabach AJ, Graham PI, Condra JH, Gotlib L, Holloway MK, Lin J, Chen IW, Vastag K, Ostovic D, Anderson PS, Emini EA, Huff JR (1994) L-735,524: an orally bioavailable human immunodeficiency virus type 1 protease inhibitor. Proc Natl Acad Sci USA 91: 4096–4100. doi: 10.1073/pnas.91.9.4096
Castle NA (1999) Recent advances in the biology of small conductance calcium-activated potassium channels. Perspect Drug Discov Des 15: 131–154. doi: 10.1023/A:1017095519863
Vergara C, LaTorre R, Marrion NV, Adelman JP (1998) Calcium-activated potassium channels. Curr Opin Neurobiol 8: 321–329. doi: 10.1016/S0959-4388(98)80056-1
Wulff H, Miller MJ, Hänsel W, Grissmer S, Cahalan MD, Chandy KG (2000) Design of a potent and selective inhibitor of the intermediate-conductance Ca2+-activated K+ channel, IKCa1: a potential immunosuppressant. Proc Natl Acad Sci USA 97: 8151–8156. doi: 10.1073/pnas.97.14.8151
Engel JC, Doyle PS, Palmer J, Hsieh I, Bainton DF, McKerrow JH (1998) Growth arrest of T. cruzi by cysteine protease inhibitors is accompanied by alterations in Golgi complex and ER ultrastructure. J Cell Sci 111: 597–606
Zhang H, Xiang ML, Ma CY, Huang Q, Li W, Xie Y, Wei YQ, Yang SY (2009) Three-class classification models of logS and logP derived by using GA-CG-SVM approach. Mol Divers 13: 261–268. doi: 10.1007/s11030-009-9108-1
Ramos de Armas R, González-Díaz H, Molina R, Uriarte E (2004) Markovian backbone negentropies: molecular descriptors for protein research. I. predicting protein stability in Arc repressor mutants. Proteins 56: 715–723. doi: 10.1002/prot.20159
Gonzalez-Diaz H, Molina R, Uriarte E (2005) Recognition of stable protein mutants with 3D stochastic average electrostatic potentials. FEBS Lett 579: 4297–4301. doi: 10.1016/j.febslet.2005.06.065
González-Díaz H, Vilar S, Santana L, Uriarte E (2007) Medicinal chemistry and bioinformatics-current trends in drugs discovery with networks topological indices. Curr Top Med Chem 7: 1015–1029. doi: 10.2174/156802607780906771
Vilar S, Gonzalez-Diaz H, Santana L, Uriarte E (2008) QSAR model for alignment-free prediction of human breast cancer biomarkers based on electrostatic potentials of protein pseudofolding HP-lattice networks. J Comput Chem 29: 2613–2622. doi: 10.1002/jcc.21016
Munteanu CR, González-Díaz H, Magalhães AL (2008) Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices. J Theor Biol 254: 476–482. doi: 10.1016/j.jtbi.2008.06.003
Fernández M, Caballero J, Fernández L, Abreu JI, Acosta G (2008) Classification of conformational stability of protein mutants from 3D pseudo folding graph representation of protein sequences using support vector machines. Proteins 70: 167–175. doi: 10.1002/prot.21524
Li ZC, Zhou XB, Lin YR, Zou XY (2008) Prediction of protein structure class by coupling improved genetic algorithm and support vector machine. Amino Acids 35: 581–590. doi: 10.1007/s00726-008-0084-z
Huang WL, Tung CW, Huang HL, Hwang SF, Ho SY (2007) ProLoc: prediction of protein subnuclear localization using SVM with automatic selection from physicochemical composition features. BioSystems 90: 573–581. doi: 10.1016/j.biosystems.2007.01.001
Block P, Paern J, Hüllermeier E, Sanschagrin P, Sotriffer CA, Klebe G (2006) Physicochemical descriptors to discriminate protein–protein interactions in permanent and transient complexes selected by means of machine learning algorithms. Proteins 65: 607–622. doi: 10.1002/prot.21104