Abstract
Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions, the shifted exponential normal and the tail-inflated normal, recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by the imposition of a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as a by-product. A further simulation analysis is conducted to assess how sensitive the proposed models and some well-established parsimonious competitors are to their own generative scheme. Lastly, the proposed and the competing models are evaluated in terms of fit and clustering on three real datasets.
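To make the eigen-decomposition mentioned in the abstract concrete, the following is a minimal sketch of the volume-shape-orientation factorization commonly used in the model-based clustering literature, Sigma = lambda * Gamma * Delta * Gamma^T with det(Delta) = 1. The function names and the symbols `lam`, `delta` and `gamma` are assumptions made for illustration; the abstract does not specify the notation or the implementation used in the paper.

```python
import numpy as np


def decompose_scale(sigma):
    """Split a symmetric positive-definite matrix into volume, shape, orientation.

    Hypothetical helper, not taken from the paper.
    """
    p = sigma.shape[0]
    # eigh returns ascending eigenvalues and orthonormal eigenvectors (columns)
    eigvals, gamma = np.linalg.eigh(sigma)
    lam = np.prod(eigvals) ** (1.0 / p)   # volume: det(sigma)^(1/p)
    delta = np.diag(eigvals / lam)        # shape: diagonal with determinant 1
    return lam, delta, gamma


def rebuild_scale(lam, delta, gamma):
    """Recombine the three factors into the original scale matrix."""
    return lam * gamma @ delta @ gamma.T


# Illustrative 2x2 scale matrix (made-up values)
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
lam, delta, gamma = decompose_scale(sigma)
rebuilt = rebuild_scale(lam, delta, gamma)
```

Under this factorization, a parsimonious family is obtained by constraining one or more of the three factors, for example to be equal across mixture components, which reduces the number of free scale parameters.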
| Original language | English |
|---|---|
| Pages (from-to) | 1-33 |
| Number of pages | 33 |
| Journal | AStA Advances in Statistical Analysis |
| Volume | 2022 |
| Issue number | 1 |
| DOI | |
| Publication status | Published - 2022 |
All Science Journal Classification (ASJC) codes
- Analysis
- Statistics and Probability
- Modelling and Simulation
- Social Sciences (miscellaneous)
- Economics and Econometrics
- Applied Mathematics
Keywords
- Mixture models
- Model-based clustering
- Multivariate shifted exponential normal distribution
- Multivariate tail-inflated normal distribution
- Parsimony