Abstract
The content of news websites changes frequently
and rapidly, and its relevance tends to decay with time. To be
of any value to users, tools such as search engines have to
cope with these evolving websites and detect their changes in a
timely manner. In this paper we apply time series analysis to
study the properties and the temporal patterns of the change
rates of the content of three news websites. Our investigation
shows that changes are characterized by large fluctuations with
periodic patterns and time-dependent behavior. The time series
describing the change rate is decomposed into trend, seasonal
and irregular components and models of each component are
then identified. The trend and seasonal components describe
the daily and weekly patterns of the change rates. Trigonometric
polynomials best fit these deterministic components,
whereas the class of ARMA models represents the irregular
component. The resulting models can be used to describe the
dynamics of the changes and predict future change rates.
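
The decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hourly observations, daily (24 h) and weekly (168 h) periods, two harmonics per period, and an AR(1) model as the simplest stand-in for the ARMA class. The series is synthetic, and all function names and parameter values are hypothetical.

```python
import numpy as np

def design_matrix(t_hours, periods=(24.0, 168.0), harmonics=2):
    """Columns: constant, linear trend, then a cos/sin pair per period and harmonic."""
    cols = [np.ones_like(t_hours), t_hours]
    for period in periods:
        for k in range(1, harmonics + 1):
            omega = 2.0 * np.pi * k / period
            cols.append(np.cos(omega * t_hours))
            cols.append(np.sin(omega * t_hours))
    return np.column_stack(cols)

def fit_deterministic(t_hours, y, periods=(24.0, 168.0), harmonics=2):
    """Least-squares fit of a trigonometric polynomial plus linear trend."""
    X = design_matrix(t_hours, periods, harmonics)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

def fit_ar1(residual):
    """Least-squares estimate of phi in e[t] = phi * e[t-1] + noise."""
    prev, curr = residual[:-1], residual[1:]
    return float(prev @ curr / (prev @ prev))

# Synthetic change-rate series (assumed values, not data from the paper):
# four weeks of hourly samples with daily and weekly cycles plus AR(1) noise.
rng = np.random.default_rng(0)
t = np.arange(4 * 168, dtype=float)
true_phi = 0.6
noise = np.zeros_like(t)
for i in range(1, len(t)):
    noise[i] = true_phi * noise[i - 1] + rng.normal()
y = 50.0 + 20.0 * np.cos(2 * np.pi * t / 24) + 10.0 * np.sin(2 * np.pi * t / 168) + noise

coef, fitted = fit_deterministic(t, y)   # trend + seasonal components
irregular = y - fitted                   # irregular component
phi_hat = fit_ar1(irregular)

# One-step-ahead forecast: deterministic part at the next hour plus the AR(1) term.
t_next = np.array([t[-1] + 1.0])
forecast = float(design_matrix(t_next) @ coef + phi_hat * irregular[-1])
print(f"daily cosine amplitude ~ {coef[2]:.2f}, phi ~ {phi_hat:.2f}, forecast ~ {forecast:.2f}")
```

In practice the number of harmonics and the ARMA orders would be chosen from the data, for example by inspecting the autocorrelation of the irregular component; the fixed choices here only keep the sketch short.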
Original language | English |
---|---|
Title of host publication | PDCAT 2012 Parallel and Distributed Computing, Applications and Technologies |
Pages | 529-533 |
Number of pages | 5 |
DOI | |
Publication status | Published - 2012 |
Event | PDCAT 2012 - Beijing. Duration: 14 Dec 2012 → 16 Dec 2012 |
Conference
Conference | PDCAT 2012 |
---|---|
City | Beijing |
Period | 14 Dec 2012 → 16 Dec 2012 |
Keywords
- PDCAT
- Parallel and distributed computing