Abstract

We present a lexically based investigation of the corpus of Seneca's opera omnia. By applying a number of statistical techniques to the textual data, we aim to group similar texts automatically into closely related clusters. We demonstrate that our objective, unsupervised method is able to distinguish the texts by work and genre.
Original language: English
Title of host publication: Advances in Latent Variables. Methods, Models and Applications
Editors: Maurizio Carpita, Eugenio Brentari, El Mostafa Qannari
Pages: 13-25
Number of pages: 13
Publication status: Published - 2014

Publication series

Name: Studies in Theoretical and Applied Statistics

Keywords

  • Clustering
  • Latin
