Cluster-weighted $$t$$-factor analyzers for robust model-based clustering and dimension reduction

Journal of the Italian Statistical Society - Volume 24, Issue 4 - Pages 623-649 - 2015
Sanjeena Subedi1, Antonio Punzo2, Salvatore Ingrassia2, Paul D. McNicholas3
1Department of Mathematics and Statistics, University of Guelph, Guelph, Canada
2Department of Economics and Business, University of Catania, Catania, Italy
3Department of Mathematics and Statistics, McMaster University, Hamilton, Canada

References

Airoldi J, Hoffmann R (1984) Age variation in voles (Microtus californicus, M. ochrogaster) and its significance for systematic studies. Occasional papers of the Museum of Natural History 111, University of Kansas, Lawrence, KS

Aitken AC (1926) On Bernoulli’s numerical solution of algebraic equations. In: Proceedings of the Royal Society of Edinburgh, vol 46, pp 289–305

Andrews JL, McNicholas PD (2011) Extending mixtures of multivariate $$t$$-factor analyzers. Stat Comput 21(3):361–373

Baek J, McLachlan GJ (2011) Mixtures of common $$t$$-factor analyzers for clustering high-dimensional microarray data. Bioinformatics 27(9):1269–1276

Baek J, McLachlan GJ, Flack LK (2010) Mixtures of factor analyzers with common factor loadings: applications to the clustering and visualization of high-dimensional data. IEEE Trans Pattern Anal Mach Intell 32(7):1298–1309

Biernacki C, Celeux G, Govaert G (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Trans Pattern Anal Mach Intell 22(7):719–725

Böhning D, Dietz E, Schaub R, Schlattmann P, Lindsay B (1994) The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family. Ann Inst Stat Math 46(2):373–388

Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38

DeSarbo WS, Cron WL (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5(2):249–282

Ehrlich I (1973) Participation in illegitimate activities: a theoretical and empirical investigation. J Polit Econ 81(3):521–565

Flury B (1997) A first course in multivariate statistics. Springer, New York

Gershenfeld N (1997) Nonlinear inference and cluster-weighted modeling. Ann NY Acad Sci 808(1):18–24

Grün B, Leisch F (2008) FlexMix version 2: finite mixtures with concomitant variables and varying and constant parameters. J Stat Softw 28(4):1–35

Hennig C (2000) Identifiability of models for clusterwise linear regression. J Classif 17(2):273–296

Ingrassia S, Minotti SC, Vittadini G (2012) Local statistical modeling via the cluster-weighted approach with elliptical distributions. J Classif 29(3):363–401

Ingrassia S, Minotti SC, Punzo A (2014) Model-based clustering via linear cluster-weighted models. Comput Stat Data Anal 71:159–182

Ingrassia S, Punzo A, Vittadini G (2015) The generalized linear mixed cluster-weighted model. J Classif 32 (in press)

Leisch F (2004) FlexMix: a general framework for finite mixture models and latent class regression in R. J Stat Softw 11(8):1–18

Lindsay BG (1995) Mixture models: theory, geometry and applications. In: NSF-CBMS regional conference series in probability and statistics, vol 5. Institute of Mathematical Statistics, Hayward

Lo Y (2008) A likelihood ratio test of a homoscedastic normal mixture against a heteroscedastic normal mixture. Stat Comput 18(3):233–240

McLachlan GJ (1987) On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. J R Stat Soc Ser C (Appl Stat) 36(3):318–324

McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York

McLachlan GJ, Bean RW, Ben-Tovim Jones L (2007) Extension of the mixture of factor analyzers model to incorporate the multivariate $$t$$-distribution. Comput Stat Data Anal 51(11):5327–5338

Meng XL, van Dyk D (1997) The EM algorithm—an old folk-song sung to a fast new tune. J R Stat Soc Ser B (Stat Methodol) 59(3):511–567

Punzo A (2014) Flexible mixture modeling with the polynomial Gaussian cluster-weighted model. Stat Model 14(3):257–291

Punzo A, Ingrassia S (2015) Parsimonious generalized linear Gaussian cluster-weighted models. In: Morlini I, Minerva T, Palumbo F (eds) Advances in statistical models for data analysis, studies in classification, data analysis and knowledge organization. Springer, Switzerland

Punzo A, Browne RP, McNicholas PD (2014) Hypothesis testing for parsimonious Gaussian mixture models. http://arxiv.org/abs/1405.0377

R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

Sakamoto Y, Ishiguro M, Kitagawa G (1983) Akaike information criterion statistics. Reidel, Boston

Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

Subedi S, Punzo A, Ingrassia S, McNicholas PD (2013) Clustering and classification via cluster-weighted factor analyzers. Adv Data Anal Classif 7(1):5–40

Vandaele W (1987) Participation in illegitimate activities: Ehrlich revisited. Report, U.S. Department of Justice, National Institute of Justice