Bayesian Measures of Model Complexity and Fit

David J. Spiegelhalter1, Nicola Best2, Bradley P. Carlin3, Angelika van der Linde4
1Medical Research Council Biostatistics Unit, Cambridge, UK
2Imperial College School of Medicine, London, UK
3University of Minnesota, Minneapolis, USA
4University of Bremen, Germany

Abstract

We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information-theoretic argument, we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general, pD corresponds approximately to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the 'hat' matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision-theoretic justification. The procedure is illustrated in several examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout, it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.
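The definitions in the abstract can be made concrete with a small sketch. The abstract does not supply the example below; it assumes a deliberately simple setting: observations y_i ~ Normal(theta, 1) with known variance and a flat prior on theta, so that the posterior is Normal(ybar, 1/n) and can be sampled directly in place of a real Markov chain Monte Carlo run. The function names `deviance` and `dic_summary` are hypothetical. With one free parameter, pD should come out close to 1.

```python
import math
import random


def deviance(y, theta, sigma=1.0):
    """Deviance D(theta) = -2 * log-likelihood for y_i ~ Normal(theta, sigma^2)."""
    n = len(y)
    rss = sum((yi - theta) ** 2 for yi in y)
    return rss / sigma**2 + n * math.log(2 * math.pi * sigma**2)


def dic_summary(y, theta_draws, sigma=1.0):
    """Return (posterior mean deviance, deviance at posterior mean, pD, DIC)."""
    m = len(theta_draws)
    # Posterior mean of the deviance, averaged over the draws.
    d_bar = sum(deviance(y, t, sigma) for t in theta_draws) / m
    # Deviance evaluated at the posterior mean of the parameter.
    theta_bar = sum(theta_draws) / m
    d_at_mean = deviance(y, theta_bar, sigma)
    # Effective number of parameters: difference of the two quantities.
    p_d = d_bar - d_at_mean
    # Deviance information criterion: posterior mean deviance plus pD.
    return d_bar, d_at_mean, p_d, d_bar + p_d


# Simulated data (assumption: true mean 2.0, known unit variance).
rng = random.Random(0)
y = [rng.gauss(2.0, 1.0) for _ in range(50)]
y_bar = sum(y) / len(y)

# Stand-in for MCMC output: direct draws from the exact posterior N(ybar, 1/n),
# which holds under the assumed flat prior on theta.
theta_draws = [rng.gauss(y_bar, 1.0 / math.sqrt(len(y))) for _ in range(5000)]

d_bar, d_at_mean, p_d, dic = dic_summary(y, theta_draws)
print(f"Dbar={d_bar:.2f}  D(theta_bar)={d_at_mean:.2f}  pD={p_d:.2f}  DIC={dic:.2f}")
```

Both ingredients are plain averages over the sampled parameter values, which is why the abstract describes them as trivial to obtain from an existing MCMC run. In a hierarchical model with an informative prior, the same calculation would typically give a pD smaller than the nominal parameter count.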

Keywords

#Complex hierarchical models #information theory #effective number of parameters #posterior deviance #posterior variance #'hat' matrix #exponential families #Bayesian measures #diagnostic plot #Markov chain Monte Carlo #deviance information criterion
