Abstract
This paper presents L-KD, a tool that relies on available linguistic and knowledge resources to perform keyphrase clustering and labelling. The aim of L-KD is to help find and trace themes in English and Italian text data, where each theme is represented by a group of keyphrases and its associated domains. We evaluate the top-ranked domains on the 20 Newsgroups dataset and show that 8 out of 10 domains match the manually assigned labels. This indicates that the approach, which requires no supervision, achieves good accuracy.
Original language | English |
---|---|
Title of host publication | Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016) |
Pages | 216-221 |
Number of pages | 6 |
Volume | 1749 |
DOI | |
Publication status | Published - 2016 |
Event | Third Italian Conference on Computational Linguistics (CLiC-it 2016) - Naples, Italy. Duration: 5 Dec 2016 → 7 Dec 2016 |
Conference
Conference | Third Italian Conference on Computational Linguistics (CLiC-it 2016) |
---|---|
City | Naples, Italy |
Period | 5 Dec 2016 → 7 Dec 2016 |
Keywords
- computational linguistics, keyphrase extraction, clustering, WordNet Domains, ConceptNet