Abstract
This paper presents L-KD, a tool that relies on available linguistic and knowledge resources to perform keyphrase clustering and labelling. L-KD aims to help find and trace themes in English and Italian text data, where each theme is represented by a group of keyphrases and its associated domains. We evaluate the top-ranked domains on the 20 Newsgroups dataset and show that 8 out of 10 domains match the manually assigned labels. This indicates that the approach, which requires no supervision, achieves good accuracy.
| Original language | English |
|---|---|
| Host publication title | Proceedings of Third Italian Conference on Computational Linguistics (CLiC-it 2016) & Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2016) |
| Pages | 216-221 |
| Number of pages | 6 |
| Volume | 1749 |
| DOI | |
| Publication status | Published - 2016 |
| Event | Third Italian Conference on Computational Linguistics (CLiC-it 2016) - Naples, Italy. Duration: 5 Dec 2016 → 7 Dec 2016 |
Conference
| Conference | Third Italian Conference on Computational Linguistics (CLiC-it 2016) |
|---|---|
| City | Naples, Italy |
| Period | 5/12/16 → 7/12/16 |
Keywords
- computational linguistics, keyphrase extraction, clustering, WordNet Domains, ConceptNet