From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering

Advances in Data Analysis and Classification - Volume 13 - Pages 33-64 - 2018

Sylvia Frühwirth-Schnatter¹, Gertraud Malsiner-Walli¹

¹Institute for Statistics and Mathematics, Vienna University of Economics and Business (WU), Vienna, Austria

Abstract

In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner Walli et al. (Stat Comput 26:303–324, 2016) is that of sparse finite mixtures, where the prior on the weight distribution of a mixture with K components is chosen in such a way that, a priori, the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixtures is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration.
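The central idea of the abstract, that a suitable prior on the weights lets the number of non-empty clusters fall well below K a priori, can be illustrated with a small prior simulation. The sketch below is not the authors' implementation. It assumes a symmetric Dirichlet prior on the K weights with a common hyperparameter, here called `e0`, and the values of K, N, and `e0` are illustrative choices rather than settings taken from the paper. The function names are hypothetical.

```python
import random

def sample_num_clusters(K, e0, N, rng):
    """One prior draw: weights ~ Dirichlet(e0, ..., e0) (assumed prior),
    N observations allocated to components, non-empty components counted."""
    # A Dirichlet draw is a vector of independent Gamma(e0, 1) draws, normalised.
    gammas = [rng.gammavariate(e0, 1.0) for _ in range(K)]
    while sum(gammas) == 0.0:
        # Guard against numerical underflow for very small e0.
        gammas = [rng.gammavariate(e0, 1.0) for _ in range(K)]
    # random.choices normalises the weights internally.
    labels = rng.choices(range(K), weights=gammas, k=N)
    return len(set(labels))

def prior_mean_clusters(K, e0, N, draws=2000, seed=1):
    """Monte Carlo estimate of the prior mean number of non-empty clusters."""
    rng = random.Random(seed)
    total = sum(sample_num_clusters(K, e0, N, rng) for _ in range(draws))
    return total / draws

if __name__ == "__main__":
    K, N = 10, 100  # illustrative values, not from the paper
    for e0 in (0.01, 1.0, 4.0):
        print(f"e0={e0}: prior mean number of clusters = "
              f"{prior_mean_clusters(K, e0, N):.2f}")
```

Under these assumptions, a small `e0` concentrates the prior on partitions with only a few occupied components, whereas a large `e0` tends to fill all K components.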

References

Aitkin M (1996) A general maximum likelihood analysis of overdispersion in generalized linear models. Stat Comput 6:251–262
Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12:171–178
Azzalini A (1986) Further results on a class of distributions which includes the normal ones. Statistica 46:199–208
Azzalini A, Capitanio A (2003) Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J R Stat Soc Ser B 65:367–389
Azzalini A, Dalla Valle A (1996) The multivariate skew normal distribution. Biometrika 83:715–726
Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49:803–821
Bennett DA, Schneider JA, Buchman AS, de Leon CM, Bienias JL, Wilson RS (2005) The Rush memory and aging project: study design and baseline characteristics of the study cohort. Neuroepidemiology 25:163–175
Bensmail H, Celeux G, Raftery AE, Robert CP (1997) Inference in model-based cluster analysis. Stat Comput 7:1–10
Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22:719–725
Celeux G, Forbes F, Robert CP, Titterington DM (2006) Deviance information criteria for missing data models. Bayesian Anal 1:651–674
Celeux G, Frühwirth-Schnatter S, Robert CP (2018) Model selection for mixture models—perspectives and strategies. In: Frühwirth-Schnatter S, Celeux G, Robert CP (eds) Handbook of mixture analysis, chapter 7. CRC Press, Boca Raton, pp 121–160
Clogg CC, Goodman LA (1984) Latent structure analysis of a set of multidimensional contingency tables. J Am Stat Assoc 79:762–771
Dellaportas P, Papageorgiou I (2006) Multivariate mixtures of normals with unknown number of components. Stat Comput 16:57–68
Escobar MD, West M (1995) Bayesian density estimation and inference using mixtures. J Am Stat Assoc 90:577–588
Escobar MD, West M (1998) Computing nonparametric hierarchical models. In: Dey D, Müller P, Sinha D (eds) Practical nonparametric and semiparametric Bayesian statistics, number 133 in lecture notes in statistics. Springer, Berlin, pp 1–22
Fall MD, Barat É (2014) Gibbs sampling methods for Pitman-Yor mixture models. Working paper https://hal.archives-ouvertes.fr/hal-00740770/file/Fall-Barat.pdf
Ferguson TS (1973) A Bayesian analysis of some nonparametric problems. Ann Stat 1:209–230
Ferguson TS (1974) Prior distributions on spaces of probability measures. Ann Stat 2:615–629
Ferguson TS (1983) Bayesian density estimation by mixtures of normal distributions. In: Rizvi MH, Rustagi JS (eds) Recent advances in statistics: papers in honor of Herman Chernoff on his sixtieth birthday. Academic Press, New York, pp 287–302
Frühwirth-Schnatter S (2004) Estimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniques. Econom J 7:143–167
Frühwirth-Schnatter S (2006) Finite mixture and Markov switching models. Springer, New York
Frühwirth-Schnatter S (2011a) Dealing with label switching under model uncertainty. In: Mengersen K, Robert CP, Titterington D (eds) Mixture estimation and applications, chapter 10. Wiley, Chichester, pp 213–239
Frühwirth-Schnatter S (2011b) Label switching under model uncertainty. In: Mengersen K, Robert CP, Titterington D (eds) Mixtures: estimation and application. Wiley, Hoboken, pp 213–239
Frühwirth-Schnatter S, Pyne S (2010) Bayesian inference for finite mixtures of univariate and multivariate skew normal and skew-t distributions. Biostatistics 11:317–336
Frühwirth-Schnatter S, Wagner H (2008) Marginal likelihoods for non-Gaussian models using auxiliary mixture sampling. Comput Stat Data Anal 52:4608–4624
Frühwirth-Schnatter S, Frühwirth R, Held L, Rue H (2009) Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data. Stat Comput 19:479–492
Frühwirth-Schnatter S, Celeux G, Robert CP (eds) (2018) Handbook of mixture analysis. CRC Press, Boca Raton
Goodman LA (1974) Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61:215–231
Green PJ, Richardson S (2001) Modelling heterogeneity with and without the Dirichlet process. Scand J Stat 28:355–375
Grün B (2018) Model-based clustering. In: Frühwirth-Schnatter S, Celeux G, Robert CP (eds) Handbook of mixture analysis, chapter 8. CRC Press, Boca Raton, pp 163–198
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Ishwaran H, James LF (2001) Gibbs sampling methods for stick-breaking priors. J Am Stat Assoc 96:161–173
Kalli M, Griffin JE, Walker SG (2011) Slice sampling mixture models. Stat Comput 21:93–105
Keribin C (2000) Consistent estimation of the order of mixture models. Sankhyā A 62:49–66
Lau JW, Green P (2007) Bayesian model-based clustering procedures. J Comput Graph Stat 16:526–558
Lazarsfeld PF, Henry NW (1968) Latent structure analysis. Houghton Mifflin, New York
Lee S, McLachlan GJ (2013) Model-based clustering and classification with non-normal mixture distributions. Stat Methods Appl 22:427–454
Linzer DA, Lewis JB (2011) poLCA: an R package for polytomous variable latent class analysis. J Stat Softw 42(10):1–29
Malsiner Walli G, Frühwirth-Schnatter S, Grün B (2016) Model-based clustering based on sparse finite Gaussian mixtures. Stat Comput 26:303–324
Malsiner Walli G, Frühwirth-Schnatter S, Grün B (2017) Identifying mixtures of mixtures using Bayesian estimation. J Comput Graph Stat 26:285–295
Malsiner-Walli G, Pauger D, Wagner H (2018) Effect fusion using model-based clustering. Stat Model 18:175–196
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley series in probability and statistics. Wiley, New York
Medvedovic M, Yeung KY, Bumgarner RE (2004) Bayesian mixture model based clustering of replicated microarray data. Bioinformatics 20:1222–1232
Miller JW, Harrison MT (2013) A simple example of Dirichlet process mixture inconsistency for the number of components. In: Advances in neural information processing systems, pp 199–206
Miller JW, Harrison MT (2018) Mixture models with a prior on the number of components. J Am Stat Assoc 113:340–356
Müller P, Mitra R (2013) Bayesian nonparametric inference—why and how. Bayesian Anal 8:269–360
Nobile A (2004) On the posterior distribution of the number of components in a finite mixture. Ann Stat 32:2044–2073
Papaspiliopoulos O, Roberts G (2008) Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika 95:169–186
Polson NG, Scott JG, Windle J (2013) Bayesian inference for logistic models using Pólya-Gamma latent variables. J Am Stat Assoc 108:1339–1349
Quintana FA, Iglesias PL (2003) Bayesian clustering and product partition models. J R Stat Soc Ser B 65:557–574
Richardson S, Green PJ (1997) On Bayesian analysis of mixtures with an unknown number of components. J R Stat Soc Ser B 59:731–792
Rousseau J, Mengersen K (2011) Asymptotic behaviour of the posterior distribution in overfitted mixture models. J R Stat Soc Ser B 73:689–710
Sethuraman J (1994) A constructive definition of Dirichlet priors. Stat Sin 4:639–650
Stern H, Arcus D, Kagan J, Rubin DB, Snidman N (1994) Statistical choices in infant temperament research. Behaviormetrika 21:1–17
van Havre Z, White N, Rousseau J, Mengersen K (2015) Overfitting Bayesian mixture models with an unknown number of components. PLoS ONE 10(7):e0131739, 1–27
Viallefont V, Richardson S, Green PJ (2002) Bayesian analysis of Poisson mixtures. J Nonparametr Stat 14:181–202
