Abstract
This paper presents the publication as Linked Open Data of CorefLat, a set of coreference and anaphora annotations on Latin texts. The annotated texts are already available as Linked Open Data in the LiLa Knowledge Base of interoperable linguistic resources for Latin. Adopting a lemma-centered architecture and annotation guidelines inspired by those of the GUM corpus, CorefLat systematically identifies and tags entities and their mentions and creates relational links between them. The annotated corpus covers multiple periods and genres, including Augustine’s Confessiones, Plautus’ Curculio, Caesar’s De Bello Gallico, and Seneca’s Medea, providing a balanced dataset for broader linguistic analysis. The publication of CorefLat as Linked Open Data relies on an OWL ontology that extends the POWLA framework, enabling interoperability with the other linguistic resources in LiLa. We detail how coreference relations, including phenomena such as anaphora, cataphora, split antecedents, and multiword units, are encoded through specialized classes and object properties.
| Original language | English |
|---|---|
| Host publication title | SemDH 2025: Second International Workshop of Semantic Digital Humanities. Co-located with ESWC 2025, June 02, 2025, Portoroz, Slovenia. |
| Publisher | CEUR-WS.org |
| Pages | N/A |
| Number of pages | 14 |
| ISSN (print) | 1613-0073 |
| Publication status | Published - 2025 |
All Science Journal Classification (ASJC) codes
- General Computer Science
Keywords
- Latin
- Linguistic Linked Open Data
- Coreference and Anaphora Resolution
- Linguistic Resources