Regression Shrinkage and Selection Via the Lasso

Robert Tibshirani
University of Toronto, Canada

Abstract

We propose a new method for estimation in linear models. The ‘lasso’ minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
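
As an illustration of why the absolute-value constraint yields coefficients that are exactly 0, the following is a minimal sketch in Python. It is not the algorithm from the paper: it assumes the equivalent penalized form (half the residual sum of squares plus lam times the sum of absolute coefficients) and solves it by cyclic coordinate descent with soft-thresholding. The function names, the penalty value lam and the toy data are hypothetical and chosen only for the example; no intercept is fitted.

```python
def soft_threshold(z, gamma):
    """Shrink z toward 0 by gamma; anything in [-gamma, gamma] becomes exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise 0.5 * RSS + lam * sum(|beta_j|) by cyclic coordinate descent.

    X is a list of rows and y a list of responses (assumed layout).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Squared norm of each column, used to rescale each coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Inner product of column j with the residual that ignores beta[j].
            rho = 0.0
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Toy data with mutually orthogonal columns (hypothetical values).
# Least squares on these data gives coefficients of about (0, 3, 0.1).
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [3.1, -2.9, 2.9, -3.1]

beta = lasso_coordinate_descent(X, y, lam=1.0)
print(beta)  # the second coefficient is shrunk to about 2.75; the others are exactly 0
```

In this sketch the small least squares coefficient on the third column is set to exactly 0 while the large one is only shrunk, which is the mix of selection and shrinkage the abstract describes. Larger values of lam correspond to a tighter bound on the sum of absolute coefficients.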


References

Breiman, 1993, Better subset selection using the non-negative garotte, Technical Report.

Breiman, 1984, Classification and Regression Trees.

Breiman, 1992, Submodel selection and evaluation in regression: the x-random case, Int. Statist. Rev., 60, 291, 10.2307/1403680

Chen, 1994, Basis pursuit, 28th Asilomar Conf. Signals, Systems Computers, Asilomar.

Donoho, 1994, Ideal spatial adaptation by wavelet shrinkage, Biometrika, 81, 425, 10.1093/biomet/81.3.425

Donoho, 1992, Maximum entropy and the nearly black object (with discussion), J. R. Statist. Soc. B, 54, 41, 10.1111/j.2517-6161.1992.tb01864.x

Donoho, 1995, Wavelet shrinkage; asymptopia?, J. R. Statist. Soc. B, 57, 301, 10.1111/j.2517-6161.1995.tb02032.x

Efron, 1993, An Introduction to the Bootstrap, 10.1007/978-1-4899-4541-9

Frank, 1993, A statistical view of some chemometrics regression tools (with discussion), Technometrics, 35, 109, 10.1080/00401706.1993.10485033

Friedman, 1991, Multivariate adaptive regression splines (with discussion), Ann. Statist., 19, 1

George, 1993, Variable selection via Gibbs sampling, J. Am. Statist. Ass., 88, 884, 10.1080/01621459.1993.10476353

Hastie, 1990, Generalized Additive Models.

Lawson, 1974, Solving Least Squares Problems.

LeBlanc, 1994, Monotone shrinkage of trees, Technical Report.

Murray, 1981, Practical Optimization.

Shao, 1993, Linear model selection by cross-validation, J. Am. Statist. Ass., 88, 486, 10.1080/01621459.1993.10476299

Stamey, 1989, Prostate specific antigen in the diagnosis and treatment of adenocarcinoma of the prostate, II: Radical prostatectomy treated patients, J. Urol., 141, 1076, 10.1016/S0022-5347(17)41175-X

Stein, 1981, Estimation of the mean of a multivariate normal distribution, Ann. Statist., 9, 1135, 10.1214/aos/1176345632

Tibshirani, 1994, A proposal for variable selection in the Cox model, Technical Report.

Zhang, 1993, Model selection via multifold CV, Ann. Statist., 21, 299, 10.1214/aos/1176349027