TY - JOUR
T1 - Model-based clustering via new parsimonious mixtures of heavy-tailed distributions
AU - Tomarchio, Salvatore D.
AU - Bagnato, Luca
AU - Punzo, Antonio
PY - 2022
Y1 - 2022
N2 - Two families of parsimonious mixture models are introduced for model-based clustering. They are based on two multivariate distributions, the shifted exponential normal and the tail-inflated normal, recently introduced in the literature as heavy-tailed generalizations of the multivariate normal. Parsimony is attained by the eigen-decomposition of the component scale matrices, as well as by imposing a constraint on the tailedness parameters. Identifiability conditions are also provided. Two variants of the expectation-maximization algorithm are presented for maximum likelihood parameter estimation. Parameter recovery and clustering performance are investigated via a simulation study. Comparisons with the unconstrained mixture models are obtained as a by-product. A further simulated analysis assesses how sensitive our models and some well-established parsimonious competitors are to their own generative schemes. Lastly, our models and the competing ones are evaluated in terms of fit and clustering on three real datasets.
KW - Mixture models
KW - Model-based clustering
KW - Multivariate shifted exponential normal distribution
KW - Multivariate tail-inflated normal distribution
KW - Parsimony
UR - http://hdl.handle.net/10807/194601
U2 - 10.1007/s10182-021-00430-8
DO - 10.1007/s10182-021-00430-8
M3 - Article
SN - 1863-8171
VL - 2022
SP - 1
EP - 33
JO - AStA Advances in Statistical Analysis
JF - AStA Advances in Statistical Analysis
ER -