Abstract
The growing amount of information published on the
Web, combined with its dynamic nature, raises many
challenging issues concerning the management and retrieval
of information and the provisioning of the underlying infrastructures.
Search engines have to meet two conflicting
requirements: minimizing the number of downloads and providing
up-to-date information. In this paper, we present the
results of an exploratory analysis aimed at investigating
the novelty of the content of a news Web site. We analyzed
the Web site from a horizontal perspective by focusing on
the content of the individual articles and from a vertical
perspective by focusing on the entire collection of articles
published on the site. These two perspectives allowed us
to study how fast and to what extent articles were modified
and to model the evolution of the Web site.
Original language | English |
---|---|
Title of host publication | Proc. SPECTS 2010 Conference |
Pages | 399-404 |
Number of pages | 6 |
Publication status | Published - 2010 |
Event | SPECTS 2010 Conference - Ottawa (Canada) Duration: 11 Jul 2010 → 17 Feb 2011 |
Workshop
Workshop | SPECTS 2010 Conference |
---|---|
City | Ottawa (Canada) |
Period | 11/7/10 → 17/2/11 |
Keywords
- DYNAMIC WEB
- PERFORMANCE
- WORKLOAD