Graph representation of high-dimensional alpha-helical membrane protein data
Abstract
In genomics and proteomics, membrane protein analysis has proven very important for understanding complex biological processes. Genome-wide investigations of membrane proteins have revealed a large number of short, distinct sequence motifs. The motifs found so far support the understanding of the folded membrane protein in the membrane environment, and they provide important information about functional or stabilizing properties. Recently, several integrative approaches have been proposed to extract meaningful information from the membrane environment. However, many information-based approaches deliver results that are poorly visualised. In high-throughput protein data analysis, visual outputs play an important role in evaluating high-dimensional protein data, establishing biological relationships and ultimately providing useful information for research. We evaluated different graphs generated from a statistical analysis of consecutive motifs in helical structures of the membrane environment. Our results show that representative motifs with high occurrence in all investigated protein families are of general importance for alpha-helical membrane structure formation. Further, motifs that often occur together with others, acting as so-called "hubs", lead to the assumption that these motifs are important components of helical structures within the membrane. In contrast, consecutive motifs and hubs that show a high occurrence only in certain families can be classified as important for family-specific functional characteristics. In summary, we are able to link our graphical results from high-throughput analysis of membrane proteins, via cross-references to databases, to a biological context. Our results and the corresponding graphical visualisation support the understanding and interpretation of structure-forming and functional motifs of membrane proteins.
Our results are useful for interpreting and refining the results of commonly used approaches. Finally, we show a simple way to visualise high-dimensional protein data in the context of biologically relevant information.
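To make the idea of a graph of consecutive motifs and its "hubs" concrete, the following minimal Python sketch counts how often one motif directly follows another within a helix and then ranks motifs by their number of distinct neighbours. It is an illustration under stated assumptions, not the method used in the study: the motif labels, the toy input and the function names `build_motif_graph` and `hub_degrees` are all hypothetical, and a hub is simply taken to be a motif with many distinct neighbours.

```python
from collections import Counter, defaultdict


def build_motif_graph(helices):
    """Count directed edges between consecutive motifs.

    `helices` is a list of helices, each given as a list of motif
    labels in sequence order (hypothetical input format).
    """
    edges = Counter()
    for motifs in helices:
        # Pair every motif with its direct successor in the same helix.
        for first, second in zip(motifs, motifs[1:]):
            edges[(first, second)] += 1
    return edges


def hub_degrees(edges):
    """Return the number of distinct neighbours per motif (direction ignored)."""
    neighbours = defaultdict(set)
    for first, second in edges:
        neighbours[first].add(second)
        neighbours[second].add(first)
    return {motif: len(adjacent) for motif, adjacent in neighbours.items()}


# Toy data, invented purely for illustration.
toy_helices = [
    ["GxxxG", "AxxxA", "LxxxL"],
    ["GxxxG", "LxxxL"],
    ["SxxxG", "GxxxG", "AxxxA"],
]

edges = build_motif_graph(toy_helices)
degrees = hub_degrees(edges)

# List motifs from most to fewest distinct neighbours.
for motif, degree in sorted(degrees.items(), key=lambda item: -item[1]):
    print(motif, degree)
```

In this toy input, "GxxxG" has the most distinct neighbours and would be the hub candidate. Grouping the same counts by protein family would separate motifs that are frequent everywhere from those that are frequent in certain families only.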