Fast Stable Direct Fitting and Smoothness Selection for Generalized Additive Models

Simon N. Wood
University of Bath, UK

Abstract

Existing computationally efficient methods for penalized likelihood generalized additive model fitting employ iterative smoothness selection on working linear models (or working mixed models). Such schemes fail to converge for a non-negligible proportion of models, with failure being particularly frequent in the presence of concurvity. If smoothness selection is performed by optimizing ‘whole model’ criteria, these problems disappear, but until now attempts to do this have employed finite-difference-based optimization schemes, which are computationally inefficient and can suffer from false convergence. The paper develops the first computationally efficient method for direct generalized additive model smoothness selection. It is highly stable, but by careful structuring achieves a computational efficiency that leads, in simulations, to lower mean computation times than the schemes that are based on working model smoothness selection. The method also offers a reliable way of fitting generalized additive mixed models.
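To make the idea of a ‘whole model’ criterion concrete, the following is a minimal sketch for the simplest possible setting: a Gaussian response, one smooth term and one smoothing parameter. It evaluates the generalized cross-validation (GCV) score of the fully fitted penalized model at each candidate value of the log smoothing parameter and keeps the minimizer. Everything here is an illustrative assumption rather than the paper's algorithm: the tent basis, the second-difference penalty, the simulated data, the grid search, and the names `hat_basis` and `gcv_score` are hypothetical. The method described in the abstract avoids grid or finite-difference searches and handles non-Gaussian responses and multiple penalties.

```python
import numpy as np

def hat_basis(x, knots):
    # Piecewise-linear "tent" basis: column j equals 1 at knots[j]
    # and 0 at every other knot (illustrative choice of basis).
    eye = np.eye(len(knots))
    return np.column_stack([np.interp(x, knots, eye[j]) for j in range(len(knots))])

def gcv_score(log_lam, X, S, y):
    # GCV score of the fitted penalized least-squares model:
    #   n * RSS / (n - edf)^2, with edf = trace of the influence matrix.
    lam = np.exp(log_lam)
    n = len(y)
    XtX = X.T @ X
    A = XtX + lam * S
    beta = np.linalg.solve(A, X.T @ y)        # penalized coefficient estimate
    edf = np.trace(np.linalg.solve(A, XtX))   # effective degrees of freedom
    rss = np.sum((y - X @ beta) ** 2)
    return n * rss / (n - edf) ** 2

# Simulated data (assumed purely for illustration).
rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, n)

knots = np.linspace(0.0, 1.0, 20)
X = hat_basis(x, knots)
D = np.diff(np.eye(len(knots)), n=2, axis=0)  # second-difference operator
S = D.T @ D                                   # wiggliness penalty matrix

# The criterion is computed on the fitted model itself, not on a
# working model produced inside an iterative fitting scheme.
grid = np.linspace(-10.0, 10.0, 81)
scores = np.array([gcv_score(l, X, S, y) for l in grid])
best_log_lam = grid[np.argmin(scores)]
print("selected log smoothing parameter:", best_log_lam)
```

A grid search is used only because it is easy to read; with several smoothing parameters the grid grows exponentially, which is one reason derivative-based optimization of the criterion matters in practice.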
