The composite absolute penalties family for grouped and hierarchical variable selection

Annals of Statistics - Volume 37, Issue 6A - 2009
Peng Zhao, Guilherme V. Rocha, Bin Yu
Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, California 94720, USA


References

[1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In <i>Proc. 2nd International Symposium on Information Theory</i> 267–281.

[2] Boyd, S. and Vandenberghe, L. (2004). <i>Convex Optimization</i>. Cambridge Univ. Press, Cambridge.

[3] Breiman, L. (1995). Better subset regression using the nonnegative garrote. <i>Technometrics</i> <b>37</b> 373–384.

[4] Chen, S., Donoho, D. and Saunders, M. (2001). Atomic decomposition by basis pursuit. <i>SIAM Rev.</i> <b>43</b> 129–159.

[5] Donoho, D. and Johnstone, I. (1994). Ideal spatial adaptation by wavelet shrinkage. <i>Biometrika</i> <b>81</b> 425–455.

[6] Efron, B. (1982). <i>The Jackknife, the Bootstrap and Other Resampling Plans</i>. SIAM, Philadelphia.

[7] Efron, B. (2004). The estimation of prediction error: Covariance penalties and cross-validation. <i>J. Amer. Statist. Assoc.</i> <b>99</b> 619–632.

[8] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. <i>Ann. Statist.</i> <b>32</b> 407–499.

[9] Frank, I. E. and Friedman, J. (1993). A statistical view of some chemometrics regression tools. <i>Technometrics</i> <b>35</b> 109–148.

[10] Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. <i>J. Comput. System Sci.</i> <b>55</b> 119–139.

[11] Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D. and Lander, E. S. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. <i>Science</i> <b>286</b> 531–537.

[12] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. <i>Technometrics</i> <b>12</b> 55–67.

[13] Kaufman, L. and Rousseeuw, P. J. (1990). <i>Finding Groups in Data: An Introduction to Cluster Analysis</i>. Wiley, New York.

[14] Kim, Y., Kim, J. and Kim, Y. (2006). Blockwise sparse regression. <i>Statist. Sinica</i> <b>16</b> 375–390.

[15] Mallows, C. L. (1973). Some comments on <i>C</i><sub><i>p</i></sub>. <i>Technometrics</i> <b>15</b> 661–675.

[16] Obozinski, G. and Jordan, M. (2009). Multi-task feature selection. <i>Statist. Comput.</i> To appear.

[17] Osborne, M., Presnell, B. and Turlach, B. (2000). A new approach to variable selection in least squares problems. <i>IMA J. Numer. Anal.</i> <b>20</b> 389–404.

[18] Rosset, S. and Zhu, J. (2007). Piecewise linear regularized solution paths. <i>Ann. Statist.</i> <b>35</b> 1012–1030.

[19] Schwarz, G. (1978). Estimating the dimension of a model. <i>Ann. Statist.</i> <b>6</b> 461–464.

[20] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. <i>J. Roy. Statist. Soc. Ser. B Methodol.</i> <b>36</b> 111–147.

[21] Sugiura, N. (1978). Further analysis of the data by Akaike’s information criterion and finite corrections. <i>Comm. Statist.</i> <b>A7</b> 13–26.

[22] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. <i>J. Roy. Statist. Soc. Ser. B</i> <b>58</b> 267–288.

[23] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. <i>J. Roy. Statist. Soc. Ser. B</i> <b>68</b> 49–67.

[24] Zhao, P. and Yu, B. (2007). Stagewise Lasso. <i>J. Mach. Learn. Res.</i> <b>8</b> 2701–2726.

[25] Zhao, P., Rocha, G. and Yu, B. (2006). Grouped and hierarchical model selection through composite absolute penalties. Technical Report 703, Dept. Statistics, UC Berkeley.

[26] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. <i>J. Roy. Statist. Soc. Ser. B</i> <b>67</b> 301–320.

[27] Zou, H., Hastie, T. and Tibshirani, R. (2007). On the “degrees of freedom” of the Lasso. <i>Ann. Statist.</i> <b>35</b> 2173–2192.