Resolving deconvolution ambiguity in gene alternative splicing
Abstract
For many gene structures, probe intensity data cannot be resolved uniquely into abundances of individual splice variants. Wang et al. noted this empirically and called it a "degeneracy problem". The ambiguity arises because the deconvolution problem is ill-posed: additional information is needed to obtain a unique solution. In this paper, we analyze the situations in which the problem occurs and present a rigorous mathematical study that gives necessary and sufficient conditions on how many and what type of constraints are needed to resolve all ambiguity. This analysis applies generally to matrix models of splice variants. We explore the proposal that probe sequence information may provide sufficient additional constraints to resolve real-world instances. However, no existing probe sequence model predicts probe behavior with sufficient accuracy, so we present a Bayesian framework for estimating variant abundances that incorporates the prediction uncertainty from the micro-model of probe responsiveness into the macro-model of probe intensities. The matrix analysis of constraints provides a tool for detecting real-world instances in which additional constraints may be necessary to resolve splice variants. While purely mathematical constraints can be stated without error, real-world constraints may themselves be poorly resolved. Our Bayesian framework provides a generic solution to the problem of uniquely estimating transcript abundances given additional constraints that may themselves be uncertain, such as regression fits to probe sequence models. We demonstrate its efficacy through extensive simulations and on several biological data sets.
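As a rough illustration of the ambiguity described above, the sketch below uses a deliberately simplified linear model that is an assumption of this example, not the model analyzed in the paper: probe intensities are taken to be y = G t, where G is a hypothetical 0/1 probe-by-variant matrix, t holds variant abundances, and all probe affinities are fixed at 1. The matrices, the abundance vectors, and the helper names `rank` and `intensities` are invented for illustration only.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a small matrix via Gauss-Jordan elimination on exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def intensities(G, t):
    """Intensities under the assumed model y = G t (unit probe affinities)."""
    return [sum(g * x for g, x in zip(row, t)) for row in G]


# Hypothetical gene: three variants, two probes.
# Probe A hits variants 1 and 2; probe B hits variants 2 and 3.
G = [[1, 1, 0],
     [0, 1, 1]]

# Two different abundance vectors that the two probes cannot tell apart.
t_a = [1, 2, 3]
t_b = [2, 1, 4]

# One extra constraint: a hypothetical probe specific to variant 1.
G_aug = G + [[1, 0, 0]]

print(rank(G), intensities(G, t_a), intensities(G, t_b))
print(rank(G_aug), intensities(G_aug, t_a), intensities(G_aug, t_b))
```

In this toy setting the two-probe matrix has rank 2 for three unknowns, so distinct abundance vectors produce identical intensities; adding one independent constraint row raises the rank to 3 and separates them. Real instances also involve unknown and uncertain probe affinities, which this sketch ignores.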
References
Johnson J, Castle J, Garrett-Engele P, Kan Z, Loerch P, Armour C, Santos R, Schadt E, Stoughton R, Shoemaker D: Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 2003, 302(5654):2141–2144. 10.1126/science.1090100
Coschigano K, Wensink P: Sex-specific transcriptional regulation by the male and female doublesex proteins of Drosophila. Genes Dev 1993, 7: 42–45. 10.1101/gad.7.1.42
Jiang Z, Wu J: Alternative splicing and programmed cell death. Proceedings of the Society for Experimental Biology and Medicine 1999, 220: 64–72. 10.1046/j.1525-1373.1999.d01-11.x
Black D: Protein Diversity from Alternative Splicing: A Challenge for Bioinformatics and Post-Genome Biology. Cell 2000, 103: 367–370. 10.1016/S0092-8674(00)00128-8
Breitbart R, Andreadis A, Nadal-Ginard B: Alternative Splicing: a Ubiquitous Mechanism for the Generation of Multiple Protein Isoforms from Single Genes. Annual Review of Biochemistry 1987, 56: 467–495. 10.1146/annurev.bi.56.070187.002343
Grabowski P, Black D: Alternative Splicing in Nervous Systems. Progress in Neurobiology 2001, 65: 289–308. 10.1016/S0301-0082(01)00007-7
Herrera-Gayol A, Jothy S: Adhesion Proteins in the Biology of Breast Cancer: Contribution of CD44. Experimental and Molecular Pathology 1999, 66: 149–156. 10.1006/exmp.1999.2251
Smith C, Patton J, Nadal-Ginard B: Alternative splicing in the control of gene expression. Annu Rev Genet 1989, 23: 527–577. 10.1146/annurev.ge.23.120189.002523
Lopez J: Alternative Splicing of Pre-mRNA: Developmental Consequences and Mechanisms of Regulation. Annual Review of Genetics 1998, 32: 279–305. 10.1146/annurev.genet.32.1.279
Lopez J: Developmental role of transcription factor isoforms generated by alternative splicing. Dev Biology 1995, 172: 396–411. 10.1006/dbio.1995.8050
Sherman L, Wainwright D, Ponta H, Herrlich P: A splice variant of CD44 expressed in the apical ectodermal ridge presents fibroblast growth factors to limb mesenchyme and is required for limb outgrowth. Genes Dev 1998, 12: 1058–1071. 10.1101/gad.12.7.1058
Boise L, Gonzalez-Garcia M, Postema C, Ding L, Lindsten T, Turka L, Mao X, Nunez G, Thompson C: bcl-x, a bcl-2-related gene that functions as a dominant regulator of apoptotic cell death. Cell 1993, 74(4):597–608. 10.1016/0092-8674(93)90508-N
Schiaffino S, Reggiani C: Molecular diversity of myofibrillar proteins: gene regulation and functional significance. Physiol Rev 1996, 76: 371–423.
MacDougall C, Harbison D, Bownes M: The developmental consequences of alternate splicing in sex determination and differentiation in Drosophila. Dev Biol 1995, 172: 353–376. 10.1006/dbio.1995.8047
Meyer T, Fromm A, Munch C, Schwalenstocker B, Fray A, Ince P, Stamm S, Gron G, Ludolph A, Shaw P: The RNA of the glutamate transporter EAAT2 is variably spliced in amyotrophic lateral sclerosis and normal individuals. J Neurol Sci 1999, 170: 45–50. 10.1016/S0022-510X(99)00196-3
Buée L, Bussière T, Buée-Scherrer V, Delacourte A, Hof PR: Tau protein isoforms, phosphorylation and role in neurodegenerative disorders. Brain Res Brain Res Rev 2000, 33: 95–130. 10.1016/S0165-0173(00)00019-9
Huntsman M, Tran BV, Potkin S, Bunney W Jr, Jones E: Altered ratios of alternatively spliced long and short gamma2 subunit mRNAs of the gamma-amino butyrate type A receptor in prefrontal cortex of schizophrenics. Proc Natl Acad Sci USA 1998, 95: 15066–15071. 10.1073/pnas.95.25.15066
Vawter M, Frye M, Hemperly J, VanderPutten D, Usen N, Doherty P, Saffell J, Issa F, Post R, Wyatt R, Freed W: Elevated concentration of N-CAM VASE isoforms in schizophrenia. J Psychiatr Res 2000, 34: 25–34. 10.1016/S0022-3956(99)00026-6
Le Corre S, Harper C, Lopez P, Ward P, Catts S: Increased levels of expression of an NMDARI splice variant in the superior temporal gyrus in schizophrenia. NeuroReport 2000, 11: 983–986.
Gunthert U, Hofmann M, Rudy W, Reber S, Zoller M, Haussmann I, Matzku S, Wenzel A, Ponta H, Herrlich P: A new variant of glycoprotein CD44 confers metastatic potential to rat carcinoma cells. Cell 1991, 65: 13–24. 10.1016/0092-8674(91)90403-L
Dredge B, Polydorides A, Darnell R: The Splice of Life: Alternative Splicing and Neurological Disease. Nature Reviews Neuroscience 2001, 2: 43–50.
Lin C, Bristol L, Jin L, Dykes-Hoberg M, Crawford T, Clawson L, Rothstein J: Aberrant RNA processing in a neurodegenerative disease: the cause for absent EAAT2, a glutamate transporter, in amyotrophic lateral sclerosis. Neuron 1998, 20(3):589–602. 10.1016/S0896-6273(00)80997-6
Hutton M, Lendon C, Rizzu P, Baker M, Froelich S, et al.: Association of missense and 5'-splice-site mutations in tau with the inherited dementia FTDP-17. Nature 1998, 393: 702–705. 10.1038/31508
Yamakawa K, Huo YK, Haendel M, Hubert R, Chen XN, Lyons G, Korenberg J: DSCAM: a novel member of the immunoglobulin superfamily maps in a Down syndrome region and is involved in the development of the nervous system. Human Molecular Genetics 1998, 7(2):227–237. 10.1093/hmg/7.2.227
Clark T, Sugnet CW, Ares M: Genomewide Analysis of mRNA Processing in Yeast Using Splicing-Specific Microarrays. Science 2002, 296(3):907–910. 10.1126/science.1069415
Yeakley J, Fan JB, Doucet D, Luo L, Wickham E, Ye Z, Chee M, Fu XD: Profiling alternative splicing on fiber-optic arrays. Nature Biotechnology 2002, 20(4):353–358. 10.1038/nbt0402-353
Ule J, Ule A, Spencer J, Williams A, Hu JS, Cline M, Wang H, Clark T, Fraser C, Ruggiu M, Zeeberg B, Kane D, Weinstein J, Blume J, Darnell R: Nova regulates brain-specific splicing to shape the synapse. Nature Genetics 2005, 37(8):844–852. 10.1038/ng1610
Le K, Mitsouras K, Roy M, Wang Q, Xu Q, Nelson S, Lee C: Detecting tissue-specific regulation of alternative splicing as a qualitative change in microarray data. Nucleic Acids Research 2004, 32(22):e180. 10.1093/nar/gnh173
Hu G, Madore S, Moldover B, Jatkoe T, Balaban D, Thomas J, Wang Y: Predicting Splice Variant from DNA Chip Expression Data. Genome Research 2001, 11(7):1237–1245. 10.1101/gr.165501
Relogio A, Ben-Dov C, Baum M, Ruggiu M, Gemund C, Benes V, Darnell R, Valcarcel J: Alternative Splicing Microarrays Reveal Functional Expression of Neuron-specific Regulators in Hodgkin Lymphoma Cells. Journal of Biological Chemistry 2005, 280(6):4779–4784. 10.1074/jbc.M411976200
Blanchette M, Green R, Brenner S, Rio D: Global Analysis of Positive and Negative pre-mRNA splicing regulators in Drosophila. Genes and Development 2005, 19(6):1306–1314. 10.1101/gad.1314205
Cline M, Blume J, Cawley S, Clark T, Hu J, Lu G, Salomonis N, Wang H, Williams A: ANOSVA: a statistical method for detecting splice variation from expression data. Bioinformatics 2005, 21(Suppl 1):i107-i115. 10.1093/bioinformatics/bti1010
Li C, Wong W: Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proceedings of the National Academy of Sciences 2001, 98: 31–36. 10.1073/pnas.011404098
Irizarry R, Bolstad B, Collins F, Cope L, Hobbs B, Speed T: Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research 2003, 31(4):e15. 10.1093/nar/gng015
Wang H, Hubbell E, Hu J, Mei G, Cline M, Lu G, Clark T, Siani-Rose M, Ares M, Kulp D, Haussler D: Gene structure-based splice variant deconvolution using a microarray platform. Bioinformatics 2003, 19(Suppl 1):i315-i322. 10.1093/bioinformatics/btg1044
Shai O, Morris Q, Blencowe B, Frey B: Inferring global levels of alternative splicing isoforms using a generative model of microarray data. Bioinformatics 2006, 22(5):606–613. 10.1093/bioinformatics/btk028
MATLAB version 7. Natick, Massachusetts: The MathWorks Inc; 2004.
Anton MA, Gorostiaga D, Guruceaga E, Segura V, Carmona-Saez P, Pascual-Montano A, Pio R, Montuenga LM, Rubio A: SPACE: an algorithm to predict and quantify alternatively spliced isoforms using microarrays. Genome Biology 2008, 9: R46+. 10.1186/gb-2008-9-2-r46
Pothen A, Fan CJ: Computing the Block Triangular Form of a Sparse Matrix. ACM Transactions on Mathematical Software 1990, 16(4):303–324. 10.1145/98267.98287
Wu C, Carta R, Zhang L: Sequence dependence of cross-hybridization on short oligo microarrays. Nucleic Acids Research 2005, 33(9):e84. 10.1093/nar/gni082
Affymetrix: Guide to Probe Logarithmic Intensity Error (PLIER) Estimation. [http://www.affymetrix.com/support/technical/technotes/plier_technote.pdf]
Zhang L, Miles M, Aldape K: A model of molecular interactions on short oligonucleotide microarrays. Nature Biotechnology 2003, 21(7):818–821. 10.1038/nbt836
Mei R, Hubbell E, Bekiranov S, Mittmann M, Christians F, Shen M, Lu G, Fang J, Liu W, Ryder T, Kaplan P, Kulp D, Webster T: Probe selection for high-density oligonucleotide arrays. Proceedings of the National Academy of Sciences 2003, 100(20):11237–11242. 10.1073/pnas.1534744100
Naef F, Magnasco M: Solving the riddle of the bright mismatches: Labeling and effective binding in oligonucleotide arrays. Physical Review E 2003, 68: 011906. 10.1103/PhysRevE.68.011906
Wu Z, Irizarry R, Gentleman R, Martinez-Murillo F, Spencer F: A Model-Based Background Adjustment for Oligonucleotide Expression Arrays. Journal of the American Statistical Association 2004, 99: 909–917. 10.1198/016214504000000683
Lacroix V, Sammeth M, Guigó R, Bergeron A: Exact Transcriptome Reconstruction from Short Sequence Reads. In WABI, Lecture Notes in Computer Science. Edited by: Crandall KA, Lagergren J. Springer; 50–63.
