Prediction of structural alphabet protein blocks using data mining

Biochimie - Volume 197 - Pages 74-85 - 2022
Mirjana M. Maljković1, Nenad S. Mitić1, Alexandre G. de Brevern2
1Faculty of Mathematics, University of Belgrade, Studentski Trg 16, 11000, Belgrade, Serbia
2Université de Paris, INSERM UMR_S 1134, DSIMB, Université de la Réunion, INTS6, Rue Alexandre Cabanel, 75015, Paris, France

References

Badaczewska-Dawid, 2020, Computational reconstruction of atomistic protein structures from coarse-grained models, Comput. Struct. Biotechnol. J., 18, 162, 10.1016/j.csbj.2019.12.007
Pauling, 1951, The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain, Proc. Natl. Acad. Sci. Unit. States Am., 37, 205, 10.1073/pnas.37.4.205
Kabsch, 1983, Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features, Biopolymers, 22, 2577, 10.1002/bip.360221211
Unger, 1989, A 3D building blocks approach to analyzing and predicting structure of proteins, Protein Struct. Funct. Genet., 5, 355, 10.1002/prot.340050410
Offmann, 2007, Local protein structures, Curr. Bioinf., 2, 165, 10.2174/157489307781662105
Hartigan, 1979, Algorithm as 136: a K-means clustering algorithm, J. Roy. Stat. Soc. C Appl. Stat., 28, 100
Kohonen, 1988, An introduction to neural computing, Neural Network., 1, 3, 10.1016/0893-6080(88)90020-2
Kohonen, 2001
Schneider, 1998, Artificial neural networks for computer-based molecular design, Prog. Biophys. Mol. Biol., 70, 175, 10.1016/S0079-6107(98)00026-1
de Brevern, 2000, Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks, Protein Struct. Funct. Genet., 41, 271, 10.1002/1097-0134(20001115)41:3<271::AID-PROT10>3.0.CO;2-Z
Tyagi, 2008, Protein structure mining using a structural alphabet, Proteins: Struct. Funct. Bioinf., 71, 920, 10.1002/prot.21776
Joseph, 2010, A short survey on protein blocks, Biophys. Rev., 2, 137, 10.1007/s12551-010-0036-1
Faure, 2019, A PyMOL plugin for an efficient 3D protein structure superimposition approach, Source Code Biol., 5
Dudev, 2007, Discovering structural motifs using a structural alphabet: application to magnesium-binding sites, BMC Bioinf., 8, 106, 10.1186/1471-2105-8-106
de Brevern, 2005, New assessment of a structural alphabet, Silico Biol., 5, 283
Etchebest, 2005, A structural alphabet for local protein structures: improved prediction methods, Proteins, 59, 810, 10.1002/prot.20458
Dong, 2008, Analysis and prediction of protein local structure based on structure alphabets, Proteins: Struct. Funct. Bioinf., 72, 163, 10.1002/prot.21904
Zimmermann, 2008, LOCUSTRA: accurate prediction of local protein structure using a two-layer support vector machine approach, J. Chem. Inf. Model., 48, 1903, 10.1021/ci800178a
Rangwala, 2009, svmPRAT: SVM-based protein residue annotation toolkit, BMC Bioinf., 10, 439, 10.1186/1471-2105-10-439
Vetrivel, 2017, Knowledge-based prediction of protein backbone conformation using a structural alphabet, PLoS One, 12, 10.1371/journal.pone.0186215
Jelovic, 2018, Finding statistically significant repeats in nucleic acids and proteins, J. Comput. Biol., 25, 375, 10.1089/cmb.2017.0046
Jelović, 2021, RepeatsPlus - program for finding motifs and repeats in data sequences, J. Bioinf. Comput. Biol., 19, 2150010, 10.1142/S0219720021500104
Heffernan, 2017, Capturing non-local interactions by long short-term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility, Bioinformatics, 33, 2842, 10.1093/bioinformatics/btx218
Linding, 2003, Protein disorder prediction: implications for structural proteomics, Structure, 11, 1453, 10.1016/j.str.2003.10.002
Walsh, 2012, ESpritz: accurate and fast prediction of protein disorder, Bioinformatics, 28, 503, 10.1093/bioinformatics/btr682
Linding, 2003, Exploring protein sequences for globularity and disorder, Nucleic Acids Res., 31, 3701, 10.1093/nar/gkg519
Mészáros, 2018, IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding, Nucleic Acids Res., 46, W329, 10.1093/nar/gky384
Erdős, 2020, Analyzing protein disorder with IUPred2A, Curr. Protoc. Bioinf., 70, e99, 10.1002/cpbi.99
Lobanov, 2011, The Ising model for prediction of disordered residues from protein sequence alone, Phys. Biol., 8, 10.1088/1478-3975/8/3/035004
Lobanov, 2013, IsUnstruct: prediction of the residue status to be ordered or disordered in the protein chain by a method based on the Ising model, J. Biomol. Struct. Dyn., 31, 1034, 10.1080/07391102.2012.718529
Yang, 2005, RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins, Bioinformatics, 21, 3369, 10.1093/bioinformatics/bti534
Romero, 2001, Sequence complexity of disordered protein, Protein Struct. Funct. Genet., 42, 38, 10.1002/1097-0134(20010101)42:1<38::AID-PROT50>3.0.CO;2-3
Wang, 2003, PISCES: a protein sequence culling server, Bioinformatics, 19, 1589, 10.1093/bioinformatics/btg224
Wang, 2005, PISCES: recent improvements to a PDB sequence culling server, Nucleic Acids Res., 33, W94, 10.1093/nar/gki402
Berman, 2003, Announcing the worldwide protein Data Bank, Nat. Struct. Mol. Biol., 10, 980, 10.1038/nsb1203-980
Schuchhardt, 1996, Local structural motifs of protein backbones are classified by self-organizing neural networks, Protein Eng., 9, 833, 10.1093/protein/9.10.833
Barnoud, 2017, PBxplore: a tool to analyze local protein structure and deformability with Protein Blocks, PeerJ, 5, 10.7717/peerj.4013
van der Lee, 2014, Classification of intrinsically disordered regions and proteins, Chem. Rev., 114, 6589, 10.1021/cr400525m
Jandrlić, 2016, Software tools for simultaneous data visualization and T cell epitopes and disorder prediction in proteins, J. Biomed. Inf., 60, 120, 10.1016/j.jbi.2016.01.016
Graves, 2012, 10.1007/978-3-642-24797-2
Graves, 2005, Framewise phoneme classification with bidirectional LSTM and other neural network architectures, Neural Network., 18, 602, 10.1016/j.neunet.2005.06.042
Agathocleous, 2010, Protein secondary structure prediction with bidirectional recurrent neural nets: can weight updating for each residue enhance performance? 6th IFIP WG 12.5 international conference on artificial intelligence applications and innovations (AIAI), Larnaca, Cyprus, 128
IBM InfoSphere Warehouse. Creating Mining Models with Intelligent Miner Modeling Version 9.5.1.
IBM SPSS Modeler 18.2 Algorithms Guide https://www.ibm.com/support/pages/spss-modeler-182-documentation (accessed 4 January 2022).
Pedregosa, 2011, Scikit-learn: machine learning in Python, J. Mach. Learn. Res., 12, 2825
Chollet, 2015
Tan, 2018
Kingma, 2017