Abstract
Clustering methods have typically been applied to continuous data. However, in many modern applications the data consist of multiple
categorical variables with no natural ordering. Within the heuristic framework, the problem of clustering such data is tackled by introducing suitable distances. In this
work, we develop a model-based approach for clustering categorical data measured on a nominal scale. Our approach is based on a mixture of distributions defined via the Hamming
distance between categorical vectors. Maximum likelihood inference is carried out via an expectation-maximization algorithm. A simulation study is carried
out to illustrate the proposed approach.
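To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's actual parameterization. It assumes a component density proportional to `exp(-lam * d)`, where `d` is the Hamming distance to a component centre and `lam` is a single hypothetical rate parameter shared by all variables. The function names (`hamming`, `hamming_pmf`, `responsibilities`) and the toy values are illustrative assumptions as well.

```python
from itertools import product
from math import exp


def hamming(x, c):
    """Number of positions at which two equal-length categorical vectors differ."""
    if len(x) != len(c):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(x, c))


def hamming_pmf(x, center, n_levels, lam):
    """Illustrative pmf proportional to exp(-lam * hamming(x, center)).

    n_levels[j] is the number of categories of variable j. Under this assumed
    form the normalizing constant factorizes over the variables:
    prod_j (1 + (m_j - 1) * exp(-lam)).
    """
    norm = 1.0
    for m in n_levels:
        norm *= 1.0 + (m - 1) * exp(-lam)
    return exp(-lam * hamming(x, center)) / norm


def responsibilities(x, centers, weights, n_levels, lam):
    """E-step quantity: posterior probability of each mixture component given x."""
    joint = [w * hamming_pmf(x, c, n_levels, lam) for c, w in zip(centers, weights)]
    total = sum(joint)
    return [j / total for j in joint]


# Toy setting (assumed values): three nominal variables with 3, 2 and 4 categories.
levels = (3, 2, 4)
center = (0, 1, 2)

# Sanity check: the pmf should sum to one over the whole sample space.
total = sum(
    hamming_pmf(x, center, levels, lam=1.5)
    for x in product(*(range(m) for m in levels))
)

# Posterior component probabilities for one observation in a two-component mixture.
post = responsibilities((0, 1, 2), [(0, 1, 2), (2, 0, 0)], [0.5, 0.5], levels, lam=1.5)
```

In a full expectation-maximization loop, these responsibilities would be computed for every observation and then used to update the mixing weights, the centres and the rate parameter; that M-step is omitted here because its form depends on the model actually specified in the paper.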
Original language | English |
---|---|
Host publication title | Book of short papers SIS 2021 |
Pages | 752-757 |
Number of pages | 6 |
Publication status | Published - 2021 |
Event | SIS 2021 - Pisa Duration: 21 Jun 2021 → 25 Jun 2021 |
Conference
Conference | SIS 2021 |
---|---|
City | Pisa |
Period | 21/6/21 → 25/6/21 |
Keywords
- Expectation-Maximization algorithm
- Hamming distribution
- mixture modeling
- nominal data