High-dimensional Ising model selection using ℓ1-regularized logistic regression
References
[1] Abbeel, P., Koller, D. and Ng, A. Y. (2006). Learning factor graphs in polynomial time and sample complexity. <i>J. Mach. Learn. Res.</i> <b>7</b> 1743–1788.
[2] Banerjee, O., El Ghaoui, L. and d'Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. <i>J. Mach. Learn. Res.</i> <b>9</b> 485–516.
[3] Bertsekas, D. (1995). <i>Nonlinear Programming</i>. Athena Scientific, Belmont, MA.
[4] Bresler, G., Mossel, E. and Sly, A. (2009). Reconstruction of Markov random fields from samples: Some easy observations and algorithms. Available at http://front.math.ucdavis.edu/0712.1402.
[5] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when <i>p</i> is much larger than <i>n</i> (with discussion). <i>Ann. Statist.</i> <b>35</b> 2313–2351.
[6] Chickering, D. (1995). Learning Bayesian networks is NP-complete. In <i>Learning from Data: Artificial Intelligence and Statistics V</i> (D. Fisher and H. Lenz, eds.). <i>Lecture Notes in Statistics</i> <b>112</b> 121–130. Springer, New York.
[7] Chow, C. and Liu, C. (1968). Approximating discrete probability distributions with dependence trees. <i>IEEE Trans. Inform. Theory</i> <b>14</b> 462–467.
[8] Cross, G. and Jain, A. (1983). Markov random field texture models. <i>IEEE Trans. Pattern Anal. Mach. Intell.</i> <b>5</b> 25–39.
[9] Csiszár, I. and Talata, Z. (2006). Consistent estimation of the basic neighborhood structure of Markov random fields. <i>Ann. Statist.</i> <b>34</b> 123–145.
[10] Dasgupta, S. (1999). Learning polytrees. In <i>Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI-99)</i>. Morgan Kaufmann, San Francisco, CA.
[11] Davidson, K. R. and Szarek, S. J. (2001). Local operator theory, random matrices, and Banach spaces. In <i>Handbook of the Geometry of Banach Spaces</i> <b>1</b> 317–336. Elsevier, Amsterdam.
[12] Donoho, D. and Elad, M. (2003). Maximal sparsity representation via <i>ℓ</i><sub>1</sub> minimization. <i>Proc. Natl. Acad. Sci. USA</i> <b>100</b> 2197–2202.
[13] Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. <i>IEEE Trans. Pattern Anal. Mach. Intell.</i> <b>6</b> 721–741.
[14] Hassner, M. and Sklansky, J. (1980). The use of Markov random fields as models of texture. <i>Comp. Graphics Image Proc.</i> <b>12</b> 357–370.
[15] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. <i>J. Amer. Statist. Assoc.</i> <b>58</b> 13–30.
[16] Horn, R. A. and Johnson, C. R. (1985). <i>Matrix Analysis</i>. Cambridge Univ. Press, Cambridge.
[17] Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. <i>Zeitschrift für Physik</i> <b>31</b> 253–258.
[18] Kalisch, M. and Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. <i>J. Mach. Learn. Res.</i> <b>8</b> 613–636.
[19] Kim, Y., Kim, J. and Kim, Y. (2006). Blockwise sparse regression. <i>Statist. Sinica</i> <b>16</b> 375–390.
[20] Koh, K., Kim, S. J. and Boyd, S. (2007). An interior-point method for large-scale <i>ℓ</i><sub>1</sub>-regularized logistic regression. <i>J. Mach. Learn. Res.</i> <b>8</b> 1519–1555.
[21] Manning, C. D. and Schütze, H. (1999). <i>Foundations of Statistical Natural Language Processing</i>. MIT Press, Cambridge, MA.
[22] Meier, L., van de Geer, S. and Bühlmann, P. (2007). The group lasso for logistic regression. Technical report, Mathematics Dept., Swiss Federal Institute of Technology Zürich.
[23] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. <i>Ann. Statist.</i> <b>34</b> 1436–1462.
[24] Ng, A. Y. (2004). Feature selection, <i>ℓ</i><sub>1</sub> vs. <i>ℓ</i><sub>2</sub> regularization, and rotational invariance. In <i>Proceedings of the Twenty-First International Conference on Machine Learning (ICML-04)</i>. Morgan Kaufmann, San Francisco, CA.
[25] Obozinski, G., Wainwright, M. J. and Jordan, M. I. (2008). Union support recovery in high-dimensional multivariate regression. Technical report, Dept. Statistics, Univ. California, Berkeley.
[26] Ripley, B. D. (1981). <i>Spatial Statistics</i>. Wiley, New York.
[27] Rockafellar, R. T. (1970). <i>Convex Analysis</i>. Princeton Univ. Press, Princeton.
[28] Rothman, A., Bickel, P., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. <i>Electron. J. Stat.</i> <b>2</b> 494–515.
[29] Santhanam, N. P. and Wainwright, M. J. (2008). Information-theoretic limits of high-dimensional graphical model selection. In <i>International Symposium on Information Theory</i>. Toronto, Canada.
[30] Spirtes, P., Glymour, C. and Scheines, R. (2000). <i>Causation, Prediction and Search</i>. MIT Press, Cambridge, MA.
[31] Srebro, N. (2003). Maximum likelihood bounded tree-width Markov networks. <i>Artificial Intelligence</i> <b>143</b> 123–138.
[32] Tropp, J. A. (2006). Just relax: Convex programming methods for identifying sparse signals. <i>IEEE Trans. Inform. Theory</i> <b>52</b> 1030–1051.
[33] Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using <i>ℓ</i><sub>1</sub>-constrained quadratic programming (Lasso). <i>IEEE Trans. Inform. Theory</i> <b>55</b> 2183–2202.
[34] Wainwright, M. J. and Jordan, M. I. (2003). Graphical models, exponential families, and variational inference. Technical Report 649, Dept. Statistics, Univ. California, Berkeley.
[35] Wainwright, M. J., Ravikumar, P. and Lafferty, J. D. (2007). High-dimensional graphical model selection using <i>ℓ</i><sub>1</sub>-regularized logistic regression. In <i>Advances in Neural Information Processing Systems</i> (B. Schölkopf, J. Platt and T. Hoffman, eds.) <b>19</b> 1465–1472. MIT Press, Cambridge, MA.
[36] Welsh, D. J. A. (1993). <i>Complexity: Knots, Colourings, and Counting</i>. Cambridge Univ. Press, Cambridge.
[37] Woods, J. (1978). Markov image modeling. <i>IEEE Trans. Automat. Control</i> <b>23</b> 846–850.
[38] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. <i>J. R. Stat. Soc. Ser. B Stat. Methodol.</i> <b>68</b> 49–67.
[39] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. <i>J. Mach. Learn. Res.</i> <b>7</b> 2541–2563.