Minimal Penalties for Gaussian Model Selection

Lucien Birgé1, Pascal Massart2
1UMR 7599 “Probabilités et modèles aléatoires”, Laboratoire de Probabilités, boîte 188, Université Paris VI, 4, place Jussieu, 75252 Paris cedex 05, France
2UMR 8628 “Laboratoire de Mathématiques”, Bât. 425, Université Paris Sud, Campus d’Orsay, 91405, Orsay Cedex, France

Abstract

Keywords


References

Abramovich, F., Benjamini, Y., Donoho, D.L., Johnstone, I.M.: Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 (2006)

Akaike H. (1969). Statistical predictor identification. Ann. Inst. Statist. Math. 22:203–217

Akaike H. (1973). Information theory and an extension of the maximum likelihood principle. In: Petrov B.N., Csaki F. (eds) Proceedings 2nd International Symposium on Information Theory. Akadémiai Kiadó, Budapest, pp. 267–281

Akaike H. (1974). A new look at the statistical model identification. IEEE Trans. Autom. Control 19:716–723

Akaike H. A Bayesian analysis of the minimum AIC procedure. Ann. Inst. Statist. Math. 30, Part A, 9–14 (1978)

Amemiya T. (1985). Advanced Econometrics. Basil Blackwell, Oxford

Barron A.R., Birgé L., Massart P. (1999). Risk bounds for model selection via penalization. Probab. Theory Relat. Fields 113:301–413

Barron A.R., Cover T.M. (1991). Minimum complexity density estimation. IEEE Trans. Inf. Theory 37:1034–1054

Birgé, L.: An alternative point of view on Lepski’s method. In: de Gunst, M.C.M., Klaassen, C.A.J., van der Vaart, A.W. (eds.) State of the Art in Probability and Statistics, Festschrift for Willem R. van Zwet, Institute of Mathematical Statistics, Lecture Notes–Monograph Series, Vol. 36. pp. 113–133 (2001)

Birgé L., Massart P. (1998). Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4:329–375

Birgé L., Massart P. (2001). Gaussian model selection. J. Eur. Math. Soc. 3:203–268

Birgé, L., Massart, P.: A generalized C_p criterion for Gaussian model selection. Technical Report No 647. Laboratoire de Probabilités, Université Paris VI (2001) http://www.proba.jussieu.fr/mathdoc/preprints/index.html#2001

Daniel C., Wood F.S. (1971). Fitting Equations to Data. Wiley, New York

Draper N.R., Smith H. (1981). Applied Regression Analysis, 2nd edn. Wiley, New York

Efron B., Hastie T., Johnstone I.M., Tibshirani R. (2004). Least angle regression. Ann. Statist. 32:407–499

Feller W. (1968). An Introduction to Probability Theory and its Applications, Vol I (3rd edn). Wiley, New York

George E.I., Foster D.P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87:731–747

Gey S., Nédélec E. (2005). Model selection for CART regression trees. IEEE Trans. Inf. Theory 51:658–670

Guyon X., Yao J.F. (1999). On the underfitting and overfitting sets of models chosen by order selection criteria. Jour. Multivar. Anal. 70:221–249

Hannan E.J., Quinn B.G. (1979). The determination of the order of an autoregression. J.R.S.S., B 41:190–195

Hoeffding W. (1963). Probability inequalities for sums of bounded random variables. J.A.S.A. 58:13–30

Hurvich C.M., Tsai C.-L. (1989). Regression and time series model selection in small samples. Biometrika 76:297–307

Johnstone, I.: Chi-square oracle inequalities. In: de Gunst, M.C.M., Klaassen, C.A.J., van der Vaart, A.W. (eds.) State of the Art in Probability and Statistics, Festschrift for Willem R. van Zwet, Institute of Mathematical Statistics, Lecture Notes–Monograph Series, Vol. 36. pp. 399–418 (2001)

Kneip A. (1994). Ordered linear smoothers. Ann. Statist. 22:835–866

Lavielle M., Moulines E. (2000). Least Squares estimation of an unknown number of shifts in a time series. J. Time Series Anal. 21:33–59

Lebarbier E. (2005). Detecting multiple change-points in the mean of a Gaussian process by model selection. Signal Process. 85:717–736

Li K.C. (1987). Asymptotic optimality for C_p, C_L, cross-validation, and generalized cross-validation: Discrete index set. Ann. Statist. 15:958–975

Loubes, J.-M., Massart, P.: Discussion of “Least angle regression” by Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. Ann. Statist. 32, 460–465 (2004)

Mallows C.L. (1973). Some comments on C_p. Technometrics 15:661–675

Massart P. (1990). The tight constant in the D.K.W. inequality. Ann. Probab. 18:1269–1283

McQuarrie A.D.R., Tsai C.-L. (1998). Regression and Time Series Model Selection. World Scientific, Singapore

Mitchell T.J., Beauchamp J.J. (1988). Bayesian variable selection in linear regression. J.A.S.A. 83:1023–1032

Polyak B.T., Tsybakov A.B. (1990). Asymptotic optimality of the C_p-test for the orthogonal series estimation of regression. Theory Probab. Appl. 35:293–306

Rissanen J. (1978). Modeling by shortest data description. Automatica 14:465–471

Schwarz G. (1978). Estimating the dimension of a model. Ann. Statist. 6:461–464

Shen X., Ye J. (2002). Adaptive model selection. J.A.S.A. 97:210–221

Shibata R. (1981). An optimal selection of regression variables. Biometrika 68:45–54

Wallace D.L. (1959). Bounds on normal approximations to Student’s and the chi-square distributions. Ann. Math. Stat. 30:1121–1130

Whittaker E.T., Watson G.N. (1927). A Course of Modern Analysis. Cambridge University Press, London

Yang Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika 92:937–950

Yao Y.C. (1988). Estimating the number of change points via Schwarz criterion. Stat. Probab. Lett. 6:181–189