Knowledge discovery from textual sources by using semantic similarity

Paolo Atzeni, Fabio Polticelli, Daniele Toti

Research output: Contribution in book > Conference contribution

2 Citations (Scopus)


We propose a methodology to automatically discover characterizing knowledge from textual sources, with the purpose of semantically categorizing them and clustering them together according to their subjects. Such a methodology is based upon several challenging steps, such as terminology extraction and disambiguation, semantic similarity identification via ontology alignment, and a core pattern-based strategy for automatic ontology building. This methodology was originally devised as an extension of PRAISED, our abbreviation identification and resolution proposal, with the purpose of allowing us to resolve previously unresolvable abbreviations, whose explanation either escapes the system's proximity-based approach or is not found within the very source text in which they appear. By moving from a paper-by-paper, mainly syntactical process to a corpus-based, semantic approach, it will in fact be possible to dramatically enhance our system in terms of its resolution capabilities. Nevertheless, the strategy we present here is not tied to this specific task, but is instead relevant to a variety of contexts, and might therefore find far wider applicability in other advanced knowledge extraction and discovery systems. Copyright (c) 2012 - Edizioni Libreria Progetto and the authors.
Original language: English
Title of host publication: Proceedings of the 20th Italian Symposium on Advanced Database Systems, SEBD 2012
Number of pages: 8
Publication status: Published - 2012
Event: 20th Italian Symposium on Advanced Database Systems, SEBD 2012 - Venice, Italy
Duration: 24 Jun 2012 to 27 Jun 2012


Conference: 20th Italian Symposium on Advanced Database Systems, SEBD 2012
City: Venice, Italy


  • knowledge discovery
  • semantic similarity

