We apply hierarchical word clustering techniques to group the occurrences of the lemma forma in the works of Thomas Aquinas, so that occurrences showing similar contextual behaviour fall into the same or closely related clusters. Our results will support the lexicographers of a new data-driven lexicon of Thomas Aquinas in writing the lexical entry for forma. We use two datasets: the Index Thomisticus (IT), a corpus containing the opera omnia of Thomas Aquinas, and the Index Thomisticus Treebank, a syntactically annotated subset of the IT.
Results are evaluated against a manually labeled subset of the occurrences of forma.
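To illustrate the general idea of grouping occurrences by contextual similarity, the following minimal sketch performs a single divisive split in the style of the DIANA algorithm. Everything in it is an assumption made for illustration: the toy contexts, the bag-of-words representation, the cosine distance, and the function names `cosine_distance`, `avg_dist` and `split` are hypothetical and are not taken from the Index Thomisticus or from the method actually used in this work.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: each occurrence of "forma" is represented by the
# lemmas found in its context. These are NOT real Index Thomisticus contexts.
contexts = {
    "occ1": ["materia", "substantia", "corpus"],
    "occ2": ["materia", "corpus", "compositum"],
    "occ3": ["intellectus", "species", "anima"],
    "occ4": ["intellectus", "anima", "cognitio"],
}


def cosine_distance(a, b):
    """Cosine distance between two bag-of-words contexts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return 1.0 - dot / norm if norm else 1.0


def avg_dist(item, group):
    """Average distance from `item` to the other members of `group`."""
    others = [g for g in group if g != item]
    if not others:
        return 0.0
    return sum(cosine_distance(contexts[item], contexts[o]) for o in others) / len(others)


def split(cluster):
    """Split one cluster into two groups (a single divisive step).

    The occurrence with the highest average distance to the rest seeds a
    splinter group; any occurrence that is closer on average to the splinter
    group than to the remainder is then moved over, until nothing moves.
    """
    rest = sorted(cluster)
    seed = max(rest, key=lambda i: avg_dist(i, rest))
    splinter = [seed]
    rest.remove(seed)
    moved = True
    while moved and len(rest) > 1:
        moved = False
        for item in list(rest):
            if len(rest) == 1:
                break
            if avg_dist(item, splinter + [item]) < avg_dist(item, rest):
                rest.remove(item)
                splinter.append(item)
                moved = True
    return sorted(splinter), sorted(rest)


groups = split(list(contexts))
print(groups)
```

On this toy data the split separates the occurrences sharing "materia" and "corpus" from those sharing "intellectus" and "anima"; which of the two groups is returned first depends on which occurrence is chosen as the seed. Applying `split` recursively to each resulting group would yield a full divisive hierarchy.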
|Title of host publication||Analysis and Modeling of Complex Data in Behavioral and Social Sciences|
|Editors||D Vicari, A Okada, G Ragozini, C Weihs|
|Number of pages||9|
|Publication status||Published - 2014|
|Name||Studies in Classification, Data Analysis, and Knowledge Organization|
- divisive hierarchical clustering analysis
- Index Thomisticus