The composite absolute penalties family for grouped and hierarchical variable selection
Abstract
Keywords
References
[1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In <i>Proc. 2nd International Symposium on Information Theory</i> 267–281.
[2] Boyd, S. and Vandenberghe, L. (2004). <i>Convex Optimization</i>. Cambridge Univ. Press, Cambridge.
[3] Breiman, L. (1995). Better subset regression using the nonnegative garrote. <i>Technometrics</i> <b>37</b> 373–384.
[4] Chen, S., Donoho, D. and Saunders, M. (2001). Atomic decomposition by basis pursuit. <i>SIAM Rev.</i> <b>43</b> 129–159.
[5] Donoho, D. and Johnstone, I. (1994). Ideal spatial adaptation by wavelet shrinkage. <i>Biometrika</i> <b>81</b> 425–455.
[6] Efron, B. (1982). <i>The Jackknife, the Bootstrap and Other Resampling Plans</i>. SIAM, Philadelphia.
[7] Efron, B. (2004). The estimation of prediction error: Covariance penalties and cross-validation. <i>J. Amer. Statist. Assoc.</i> <b>99</b> 619–632.
[8] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. <i>Ann. Statist.</i> <b>32</b> 407–499.
[9] Frank, I. E. and Friedman, J. (1993). A statistical view of some chemometrics regression tools. <i>Technometrics</i> <b>35</b> 109–148.
[10] Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. <i>J. Comput. System Sci.</i> <b>55</b> 119–139.
[11] Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D. and Lander, E. S. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. <i>Science</i> <b>286</b> 531–537.
[12] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. <i>Technometrics</i> <b>12</b> 55–67.
[13] Kaufman, L. and Rousseeuw, P. J. (1990). <i>Finding Groups in Data: An Introduction to Cluster Analysis</i>. Wiley, New York.
[14] Kim, Y., Kim, J. and Kim, Y. (2006). Blockwise sparse regression. <i>Statist. Sinica</i> <b>16</b> 375–390.
[15] Mallows, C. L. (1973). Some comments on <i>C</i><sub><i>p</i></sub>. <i>Technometrics</i> <b>15</b> 661–675.
[16] Obozinski, G. and Jordan, M. (2009). Multi-task feature selection. <i>Statist. Comput.</i> To appear.
[17] Osborne, M., Presnell, B. and Turlach, B. (2000). A new approach to variable selection in least squares problems. <i>IMA J. Numer. Anal.</i> <b>20</b> 389–404.
[18] Rosset, S. and Zhu, J. (2007). Piecewise linear regularized solution paths. <i>Ann. Statist.</i> <b>35</b> 1012–1030.
[19] Schwarz, G. (1978). Estimating the dimension of a model. <i>Ann. Statist.</i> <b>6</b> 461–464.
[20] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. <i>J. Roy. Statist. Soc. Ser. B</i> <b>36</b> 111–147.
[21] Sugiura, N. (1978). Further analysis of the data by Akaike’s information criterion and finite corrections. <i>Comm. Statist.</i> <b>A7</b> 13–26.
[22] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. <i>J. Roy. Statist. Soc. Ser. B</i> <b>58</b> 267–288.
[23] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. <i>J. Roy. Statist. Soc. Ser. B</i> <b>68</b> 49–67.
[24] Zhao, P. and Yu, B. (2007). Stagewise Lasso. <i>J. Mach. Learn. Res.</i> <b>8</b> 2701–2726.
[25] Zhao, P., Rocha, G. and Yu, B. (2006). Grouped and hierarchical model selection through composite absolute penalties. Technical Report 703, Dept. Statistics, UC Berkeley.
[26] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. <i>J. Roy. Statist. Soc. Ser. B</i> <b>67</b> 301–320.
[27] Zou, H., Hastie, T. and Tibshirani, R. (2007). On the “degrees of freedom” of the Lasso. <i>Ann. Statist.</i> <b>35</b> 2173–2192.