KD Strikes Back: from Keyphrases to Labelled Domains Using External Knowledge Sources

Rachele Sprugnoli, Giovanni Moretti, Sara Tonelli

Research output: Contribution to book › Conference contribution

Abstract

This paper presents L-KD, a tool that relies on available linguistic and knowledge resources to perform keyphrase clustering and labelling. The aim of L-KD is to help find and trace themes in English and Italian text data, represented by groups of keyphrases and associated domains. We evaluate the top-ranked domains on the 20 Newsgroups dataset and show that 8 domains out of 10 match the manually assigned labels. This confirms the good accuracy of the approach, which does not require supervision.
Original language: English
Host publication title: Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016)
Pages: 216-221
Number of pages: 6
Publication status: Published - 2016
Event: Third Italian Conference on Computational Linguistics (CLiC-it 2016) - Naples, Italy
Duration: 5 Dec 2016 to 7 Dec 2016

Conference

Conference: Third Italian Conference on Computational Linguistics (CLiC-it 2016)
City: Naples, Italy
Period: 5 Dec 2016 to 7 Dec 2016

Keywords

  • computational linguistics, keyphrase extraction, clustering, wordnet domains, conceptnet
