Bayesian sparse convex clustering via global-local shrinkage priors
Tóm tắt
Sparse convex clustering is to group observations and conduct variable selection simultaneously in the framework of convex clustering. Although a weighted
$$L_1$$
norm is usually employed for the regularization term in sparse convex clustering, its use increases the dependence on the data and reduces the estimation accuracy if the sample size is not sufficient. To tackle these problems, this paper proposes a Bayesian sparse convex clustering method based on the ideas of Bayesian lasso and global-local shrinkage priors. We introduce Gibbs sampling algorithms for our method using scale mixtures of normal distributions. The effectiveness of the proposed methods is shown in simulation studies and a real data analysis.
Tài liệu tham khảo
Andrews DF, Mallows CL (1974) Scale mixtures of normal distributions. J R Stat Soc B 36(1):99–102
Bhadra A, Datta J, Polson NG, Willard B (2019) Lasso meets horseshoe: a survey. Stat Sci 34(3):405–427
Bhadra A, Datta J, Polson NG, Willard B et al (2017) The horseshoe+ estimator of ultra-sparse signals. Bayesian Analysis 12(4):1105–1131
Bhattacharya A, Pati D, Pillai NS, Dunson DB (2015) Dirichlet–Laplace priors for optimal shrinkage. J Am Stat Assoc 110(512):1479–1490
Brown PJ, Griffin JE (2010) Inference with normal-gamma prior distributions in regression problems. Bayesian Anal 5(1):171–188
Cadonna A, Frühwirth-Schnatter S, Knaus P (2020) Triple the gamma—a unifying shrinkage prior for variance and variable selection in sparse state space and TVP models. Econometrics 8(2):20
Carvalho CM, Polson NG, Scott JG (2010) The horseshoe estimator for sparse signals. Biometrika 97(2):465–480
Chandra NK, Canale A, Dunson DB (2020) Bayesian clustering of high-dimensional data. arXiv preprint arXiv:2006.02700
Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32(2):407–499
Frühwirth-Schnatter S, Malsiner-Walli G (2019) From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering. Adv Data Anal Classif 13(1):33–64
Griffin JE, Brown PJ (2005) Alternative prior distributions for variable selection with very many more variables than observations. University of Kent Technical Report
Griffin JE, Brown PJ (2011) Bayesian hyper-lassos with non-convex penalization. Aust N Z J Stat 53(4):423–442
Hartigan JA, Wong MA (1979) A k-means clustering algorithm. J R Stat Soc Ser C (Appl Stat) 28(1):100–108
Hocking TD, Joulin A, Bach F, Vert J-P (2011) Clusterpath : an algorithm for clustering using convex fusion penalties. In: Proceedings of the 28th international conference on machine learning (ICML)
Johndrow JE, Orenstein P, Bhattacharya A (2020) Scalable approximate MCMC algorithms for the horseshoe prior. J Mach Learn Res 21(73):1–61
Lichman M (2013) UCI machine learning repository, 2013. http://archive.ics.uci.edu/ml
Makalic E, Schmidt DF (2015) A simple sampler for the horseshoe estimator. IEEE Signal Process Lett 23(1):179–182
Malsiner-Walli G, Frühwirth-Schnatter S, Grün B (2016) Model-based clustering based on sparse finite Gaussian mixtures. Stat Comput 26(1–2):303–324
McLachlan GJ, Lee SX, Rathnayake SI (2019) Finite mixture models. Annu Rev Stat Appl 6:355–378
Park T, Casella G (2008) The Bayesian Lasso. J Am Stat Assoc 103:681–686
Piironen J, Vehtari A et al (2017) Sparsity information and regularization in the horseshoe and other shrinkage priors. Electron J Stat 11(2):5018–5051
Polson NG, Scott JG (2010) Shrink globally, act locally: Sparse Bayesian regularization and prediction. Bayesian Stat 9:501–538
Ray K, Szabó B (2020) Variational Bayes for high-dimensional linear regression with sparse priors. J Am Stat Assoc 1–31
Rigon T, Herring AH, Dunson DB (2020) A generalized Bayes framework for probabilistic clustering. arXiv preprint arXiv:2006.05451
Shimamura K, Ueki M, Kawano S, Konishi S (2019) Bayesian generalized fused lasso modeling via neg distribution. Commun Stat Theory Methods 48(16):4132–4153
Van Erp S, Oberski DL, Mulder J (2019) Shrinkage priors for Bayesian penalized regression. J Math Psychol 89:31–50
Wade S, Ghahramani Z et al (2018) Bayesian cluster analysis: point estimation and credible balls (with discussion). Bayesian Anal 13(2):559–626
Wang B, Zhang Y, Sun WW, Fang Y (2018) Sparse convex clustering. J Comput Graph Stat 27(2):393–403
Wang Y, Blei DM (2019) Frequentist consistency of variational Bayes. J Am Stat Assoc 114(527):1147–1161
Xu X, Ghosh M (2015) Bayesian variable selection and estimation for group lasso. Bayesian Anal 10(4):909–936
Yau C, Holmes C (2011) Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination. Bayesian Anal (Online) 6(2):329
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc B 68(1):49–67
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101(476):1418–1429