More Data and New Tools. Advances in Parsing the Index Thomisticus Treebank

Federica Gamba, Marco Carlo Passarotti

Research output: Chapter in book, Conference contribution

Abstract

This paper investigates recent advances in parsing the Index Thomisticus Treebank, which comprises Medieval Latin texts by Thomas Aquinas. The research focuses on two variables. On the one hand, it examines the impact of a larger dataset on parsing results; on the other hand, it compares the performance of new parsers with that of older tools. The baseline for measuring these advances is the set of results on parsing the Index Thomisticus Treebank reported in a previous work. First, the best-performing parser from that study is tested on a larger dataset than the one originally used. Then, some parser combinations developed in the same study are evaluated as well, showing that more training data lead to more accurate results. Finally, to examine the impact of newly available tools on parsing results, we train, test, and evaluate two neural parsers chosen from among the best performers in the CoNLL 2018 Shared Task. Our experiments reach the highest accuracy rates achieved so far in automatic syntactic parsing of the Index Thomisticus Treebank and of Latin overall.
Original language: English
Title of host publication: Proceedings of the Conference on Computational Humanities Research 2021. Amsterdam, the Netherlands, November 17-19, 2021, CEUR Workshop Proceedings, 2021
Pages: 108-122
Number of pages: 15
Publication status: Published - 2021
Event: Conference on Computational Humanities Research 2021 - Amsterdam, Netherlands
Duration: 17 Nov 2021 to 21 Nov 2021

Conference

Conference: Conference on Computational Humanities Research 2021
City: Amsterdam, Netherlands
Period: 17/11/21 to 21/11/21

Keywords

  • Index Thomisticus Treebank
  • Latin
  • Natural Language Processing
  • Parsing
