Bayesian estimation of the discrete coefficient of determination
Abstract
The discrete coefficient of determination (CoD) measures the nonlinear interaction between discrete predictor and target variables and has found wide application in Genomic Signal Processing. Previous work has addressed inference of the discrete CoD using classical parametric and nonparametric approaches. In this paper, we introduce a Bayesian framework for inference of the discrete CoD. We analytically derive the optimal minimum mean-square error (MMSE) CoD estimator, as well as a CoD estimator based on the Optimal Bayesian Predictor (OBP). For the latter estimator, we give exact expressions for its bias, variance, and root-mean-square (RMS) error. The accuracy of both Bayesian CoD estimators, with non-informative and informative priors and under fixed or random parameters, is studied using analytical and numerical approaches. We also demonstrate the proposed Bayesian approach in the inference of gene regulatory networks, using gene-expression data from a previously published study on metastatic melanoma.
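As a rough illustration of the quantities named in the abstract, the sketch below computes the discrete CoD from a known joint distribution, using the standard definition CoD = (ε0 − ε)/ε0, where ε0 is the error of the best predictor that ignores the predictors and ε is the error of the best predictor that uses them. It then approximates the posterior mean of the CoD, which is what an MMSE estimator targets, by Monte Carlo sampling. The function names, the single symmetric Dirichlet prior over the joint probability table, and the parameters `alpha` and `n_draws` are assumptions made for this example; the paper derives its estimators analytically, and its prior parameterization may differ.

```python
import numpy as np


def discrete_cod(joint):
    """Discrete CoD for a joint pmf with joint[x, y] = P(X = x, Y = y)."""
    joint = np.asarray(joint, dtype=float)
    p_y = joint.sum(axis=0)              # marginal distribution of the target
    eps0 = 1.0 - p_y.max()               # error of the best constant predictor
    eps = 1.0 - joint.max(axis=1).sum()  # error of the best predictor using X
    if eps0 == 0.0:
        return 0.0                       # convention for a degenerate target
    return (eps0 - eps) / eps0


def posterior_mean_cod(counts, alpha=1.0, n_draws=20000, seed=0):
    """Monte Carlo approximation of E[CoD | data].

    Assumes (hypothetically) a symmetric Dirichlet(alpha) prior over the
    whole joint pmf; counts[x, y] is the number of samples with X = x, Y = y.
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    # Posterior of a Dirichlet prior under multinomial counts is Dirichlet.
    draws = rng.dirichlet(counts.ravel() + alpha, size=n_draws)
    cods = [discrete_cod(d.reshape(counts.shape)) for d in draws]
    return float(np.mean(cods))


# One binary predictor that matches the binary target 80% of the time.
true_joint = [[0.4, 0.1],
              [0.1, 0.4]]
cod_true = discrete_cod(true_joint)

# Hypothetical sample of 20 observations drawn from a similar system.
sample_counts = [[8, 2],
                 [2, 8]]
cod_post = posterior_mean_cod(sample_counts)
```

For the joint distribution above, ε0 = 0.5 and ε = 0.2, so the CoD is 0.6. The Monte Carlo average is only a numerical stand-in for a closed-form posterior expectation, and its value depends on the assumed prior.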