Sparse Additive Models

Pradeep Ravikumar1, John Lafferty2, Han Liu2, Larry Wasserman2
1University of California, Berkeley, USA
2Carnegie Mellon University, Pittsburgh, USA

Summary

We present a new class of methods for high dimensional non-parametric regression and classification called sparse additive models. Our methods combine ideas from sparse linear modelling and additive non-parametric regression. We derive an algorithm for fitting the models that is practical and effective even when the number of covariates is larger than the sample size. Sparse additive models are essentially a functional version of the grouped lasso of Yuan and Lin (2006). They are also closely related to the COSSO model of Lin and Zhang (2006) but decouple smoothing and sparsity, enabling the use of arbitrary non-parametric smoothers. We give an analysis of the theoretical properties of sparse additive models and present empirical results on synthetic and real data, showing that they can be effective in fitting sparse non-parametric models in high dimensional data.
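
The summary only alludes to the fitting algorithm, but one natural reading is a backfitting loop in which each component function is estimated by a univariate smoother applied to partial residuals and then soft thresholded, so that entire components can be shrunk exactly to zero. The sketch below is a minimal, hypothetical illustration of that idea under those assumptions, not the authors' implementation: the Nadaraya-Watson smoother, the names spam_backfit and kernel_smoother, and the parameters lam, n_iter and bandwidth are all our own choices.

```python
import numpy as np

def kernel_smoother(x, r, bandwidth=0.1):
    """Nadaraya-Watson smoother of responses r on a single covariate x."""
    d = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)             # Gaussian kernel weights
    return w @ r / w.sum(axis=1)

def spam_backfit(X, y, lam, n_iter=50, bandwidth=0.1):
    """Fit y ~ sum_j f_j(X[:, j]) by soft-thresholded backfitting (sketch)."""
    n, p = X.shape
    f = np.zeros((n, p))                   # fitted component functions
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every component except the j-th.
            r = y - y.mean() - f.sum(axis=1) + f[:, j]
            # Smooth the residual against covariate j.
            g = kernel_smoother(X[:, j], r, bandwidth)
            # Soft threshold the whole component: it is zeroed out when
            # its estimated norm falls below the regularization level lam.
            norm = np.sqrt(np.mean(g ** 2))
            f[:, j] = max(0.0, 1.0 - lam / norm) * g if norm > 0 else 0.0
            f[:, j] -= f[:, j].mean()      # center each component
    return f

# Toy usage: only the first two of ten covariates are relevant.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 10))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(100)
f_hat = spam_backfit(X, y, lam=0.05, bandwidth=0.2)
active = [j for j in range(10) if np.abs(f_hat[:, j]).max() > 1e-8]
```

Because each update involves only a univariate smoother, the loop remains feasible when the number of covariates exceeds the sample size; in practice lam would be chosen by, for example, cross-validation.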

Keywords


References

Antoniadis, 2001, Regularized wavelet approximations (with discussion), J. Am. Statist. Ass., 96, 939, 10.1198/016214501753208942

Buja, 1989, Linear smoothers and additive models, Ann. Statist., 17, 453

Bunea, 2007, Sparsity oracle inequalities for the lasso, Electron. J. Statist., 1, 169, 10.1214/07-EJS008

Daubechies, 2004, An iterative thresholding algorithm for linear inverse problems, Communs Pure Appl. Math., 57, 1413, 10.1002/cpa.20042

Daubechies, 2007, Accelerated projected gradient method for linear inverse problems with sparsity constraints

Fan, 2005, Nonparametric inference for additive models, J. Am. Statist. Ass., 100, 890, 10.1198/016214504000001439

Fan, 2001, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Statist. Ass., 96, 1348, 10.1198/016214501753382273

Greenshtein, 2004, Persistency in high dimensional linear predictor-selection and the virtue of over-parametrization, Bernoulli, 10, 971, 10.3150/bj/1106314846

Hastie, 1999, Generalized Additive Models

Juditsky, 2000, Functional aggregation for nonparametric regression, Ann. Statist., 28, 681, 10.1214/aos/1015951994

Koltchinskii, 2008, Sparse recovery in large ensembles of kernel machines, Proc. 21st A. Conf. Learning Theory, 229

Ledoux, 1991, Probability in Banach Spaces: Isoperimetry and Processes, 10.1007/978-3-642-20212-4

Lin, 2006, Component selection and smoothing in multivariate nonparametric regression, Ann. Statist., 34, 2272, 10.1214/009053606000000722

Meier, 2008, High-dimensional additive modelling

Meinshausen, 2006, High dimensional graphs and variable selection with the lasso, Ann. Statist., 34, 1436, 10.1214/009053606000000281

Meinshausen, 2006, Lasso-type recovery of sparse representations for high-dimensional data, 10.21236/ADA472998

Olshausen, 1996, Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature, 381, 607, 10.1038/381607a0

Ravikumar, 2008, SpAM: sparse additive models, Advances in Neural Information Processing Systems, 1201

Tibshirani, 1996, Regression shrinkage and selection via the lasso, J. R. Statist. Soc. B, 58, 267

van der Vaart, 1998, Asymptotic Statistics, 10.1017/CBO9780511802256

Wainwright, 2006, Sharp thresholds for high-dimensional and noisy recovery of sparsity

Wainwright, 2007, High-dimensional graphical model selection using l1-regularized logistic regression, Advances in Neural Information Processing Systems, 1465

Wasserman, 2007, Multi-stage variable selection: screen and clean

Yuan, 2007, Nonnegative garrote component selection in functional ANOVA models, Proc. Artif. Intell. Statist.

Yuan, 2006, Model selection and estimation in regression with grouped variables, J. R. Statist. Soc. B, 68, 49, 10.1111/j.1467-9868.2005.00532.x

Zhao, 2006, On model selection consistency of lasso, J. Mach. Learn. Res., 7, 2541

Zou, 2006, The adaptive lasso and its oracle properties, J. Am. Statist. Ass., 101, 1418, 10.1198/016214506000000735