Corpus linguistics is now a well-established field of research that requires collaboration with both computational and theoretical linguistics. Computational linguistics uses corpus data to train probabilistic Natural Language Processing (NLP) tools, such as taggers and parsers, while theoretical linguistics, in empirical approaches to the study of language, draws on corpus evidence. For its part, corpus linguistics, as a discipline in its own right, uses NLP tools to build annotated corpora (semi-)automatically and relies on linguistic theory as the backbone for the design of annotation guidelines. The creation of a linguistically annotated corpus is therefore an excellent opportunity to apply linguistic theories designed in a pre-corpus era to real data, and potentially to revise them. The challenge is even more attractive when a language like Latin is involved. Indeed, while the language-dependent computational processing of Latin is today limited to automatic morphological tagging, a number of available language-independent methods and tools of analysis can be applied to it.
|Number of pages||19|
|Publication status||Published - 2009|