Least angle regression

Annals of Statistics - Vol. 32, No. 2 - 2004
Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani
Department of Statistics, Stanford University

Abstract

Keywords


References

Breiman, L. (1996). Bagging predictors. Machine Learning 24 123--140.

Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting (with discussion). Ann. Statist. 28 337--407.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289--300.

Breiman, L. (1999). Prediction games and arcing algorithms. Neural Computation 11 1493--1517.

Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55 119--139.

Bühlmann, P. and Yu, B. (2002). Analyzing bagging. Ann. Statist. 30 927--961.

Abramovich, F., Benjamini, Y., Donoho, D. and Johnstone, I. (2000). Adapting to unknown sparsity by controlling the false discovery rate. Technical Report 2000-19, Dept. Statistics, Stanford Univ.

Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81 425--455.

Ishwaran, H. and Rao, J. S. (2000). Bayesian nonparametric MCMC for large variable selection problems. Unpublished manuscript.

Shao, J. (1993). Linear model selection by cross-validation. J. Amer. Statist. Assoc. 88 486--494.

Ye, J. (1998). On measuring and correcting the effects of data mining and model selection. J. Amer. Statist. Assoc. 93 120--131.

Foster, D. P. and George, E. I. (1994). The risk inflation criterion for multiple regression. Ann. Statist. 22 1947--1975.

Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9 1135--1151.

Rao, C. R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. Wiley, New York.

McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, London.

Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA.

Cook, R. D. (1998). Regression Graphics. Wiley, New York.

Efron, B. (1986). How biased is the apparent error rate of a prediction rule? J. Amer. Statist. Assoc. 81 461--470.

Efron, B. and Tibshirani, R. (1997). Improvements on cross-validation: The $.632+$ bootstrap method. J. Amer. Statist. Assoc. 92 548--560.

Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Ann. Statist. 29 1189--1232.

Golub, G. and Van Loan, C. (1983). Matrix Computations. Johns Hopkins Univ. Press, Baltimore, MD.

Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, New York.

Lawson, C. and Hanson, R. (1974). Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs, NJ.

Mallows, C. (1973). Some comments on $C_p$. Technometrics 15 661--675.

Meyer, M. and Woodroofe, M. (2000). On the degrees of freedom in shape-restricted regression. Ann. Statist. 28 1083--1104.

Osborne, M. R., Presnell, B. and Turlach, B. A. (2000a). A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 389--403.

Osborne, M. R., Presnell, B. and Turlach, B. A. (2000b). On the LASSO and its dual. J. Comput. Graph. Statist. 9 319--337.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267--288.

Weisberg, S. (1980). Applied Linear Regression. Wiley, New York.

Breiman, L. (1992). The little bootstrap and other methods for dimensionality selection in regression: $X$-fixed prediction error. J. Amer. Statist. Assoc. 87 738--754.

George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. J. Amer. Statist. Assoc. 88 881--889.

Ishwaran, H. and Rao, J. S. (2003). Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Statist. Assoc. 98 438--455.

Mitchell, T. J. and Beauchamp, J. J. (1988). Bayesian variable selection in linear regression (with discussion). J. Amer. Statist. Assoc. 83 1023--1036.

Akaike, H. (1973). Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60 255--265.

Birgé, L. and Massart, P. (2001a). Gaussian model selection. J. Eur. Math. Soc. 3 203--268.

Birgé, L. and Massart, P. (2001b). A generalized $C_p$ criterion for Gaussian model selection. Technical Report 647, Univ. Paris 6 & 7.

Knight, K. and Fu, W. (2000). Asymptotics for Lasso-type estimators. Ann. Statist. 28 1356--1378.

Loubes, J.-M. and van de Geer, S. (2002). Adaptive estimation with soft thresholding penalties. Statist. Neerlandica 56 453--478.

van de Geer, S. (2001). Least squares estimation with complexity penalties. Math. Methods Statist. 10 355--374.

Breiman, L. (2001). Random forests. Available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/randomforest2001.pdf.

Fu, W. J. (1998). Penalized regressions: The Bridge versus the Lasso. J. Comput. Graph. Statist. 7 397--416.

Ridgeway, G. (2003). GBM 0.7-2 package manual. Available at http://cran.r-project.org/doc/packages/gbm.pdf.

Mason, L., Baxter, J., Bartlett, P. and Frean, M. (2000). Boosting algorithms as gradient descent. In Advances in Neural Information Processing Systems 12 512--518. MIT Press, Cambridge, MA.

Rosset, S. and Zhu, J. (2004). Piecewise linear regularized solution paths. Advances in Neural Information Processing Systems 16. To appear.

Rosset, S., Zhu, J. and Hastie, T. (2003). Boosting as a regularized path to a maximum margin classifier. Technical report, Dept. Statistics, Stanford Univ.

Zhu, J., Rosset, S., Hastie, T. and Tibshirani, R. (2004). 1-norm support vector machines. Neural Information Processing Systems 16. To appear.

Blake, C. and Merz, C. (1998). UCI repository of machine learning databases. Technical report, School of Information and Computer Science, Univ. California, Irvine. Available at www.ics.uci.edu/~mlearn/MLRepository.html.

Foster, D. P. and Stine, R. A. (1996). Variable selection via information theory. Technical Report Discussion Paper 1180, Center for Mathematical Studies in Economics and Management Science, Northwestern Univ.

Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics 37 373--384.

Moore, D. S. and McCabe, G. P. (1999). Introduction to the Practice of Statistics, 3rd ed. Freeman, New York.

Nelder, J. A. (1977). A reformulation of linear models (with discussion). J. Roy. Statist. Soc. Ser. A 140 48--76.

Nelder, J. A. (1994). The statistics of linear models: Back to basics. Statist. Comput. 4 221--234.

Cook, R. D. and Weisberg, S. (1999a). Applied Regression Including Computing and Graphics. Wiley, New York.

Cook, R. D. and Weisberg, S. (1999b). Graphs in statistical analysis: Is the medium the message? Amer. Statist. 53 29--37.

Efron, B. (2001). Discussion of "Statistical modeling: The two cultures," by L. Breiman. Statist. Sci. 16 218--219.

Li, K. C. (1991). Sliced inverse regression for dimension reduction (with discussion). J. Amer. Statist. Assoc. 86 316--342.

Weisberg, S. (1981). A statistic for allocating $C_p$ to individual cases. Technometrics 23 27--31.

Weisberg, S. (2002). Dimension reduction regression in R. J. Statistical Software 7. (On-line journal available at www.jstatsoft.org. The software is available from cran.r-project.org.)

Efron, B. (2004). The estimation of prediction error: Covariance penalties and cross-validation. J. Amer. Statist. Assoc. To appear.

Foster, D. and Stine, R. (1997). An information theoretic comparison of model selection criteria. Technical report, Dept. Statistics, Univ. Pennsylvania.

George, E. I. and Foster, D. P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87 731--747.

Leblanc, M. and Tibshirani, R. (1998). Monotone shrinkage of trees. J. Comput. Graph. Statist. 7 417--433.