The content of news websites changes frequently and rapidly, and its relevance tends to decay with time. To be of any value to users, tools such as search engines have to cope with these evolving websites and detect their changes in a timely manner. In this paper we apply time series analysis to study the properties and temporal patterns of the change rates of the content of three news websites. Our investigation shows that changes are characterized by large fluctuations with periodic patterns and time-dependent behavior. The time series describing the change rate is decomposed into trend, seasonal and irregular components, and a model of each component is then identified. The trend and seasonal components describe the daily and weekly patterns of the change rates. Trigonometric polynomials best fit these deterministic components, whereas the class of ARMA models represents the irregular component. The resulting models can be used to describe the dynamics of the changes and to predict future change rates.
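The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the hourly sampling, the function names (`trig_design`, `fit_trig`, `fit_ar`), the number of harmonics, and the use of a least-squares AR(1) fit as a simple stand-in for the ARMA class are all assumptions made for illustration.

```python
import numpy as np


def trig_design(t, periods, harmonics=2):
    """Design matrix: a constant plus cosine/sine pairs for each period."""
    cols = [np.ones_like(t, dtype=float)]
    for p in periods:
        for k in range(1, harmonics + 1):
            w = 2.0 * np.pi * k * t / p
            cols.append(np.cos(w))
            cols.append(np.sin(w))
    return np.column_stack(cols)


def fit_trig(y, t, periods, harmonics=2):
    """Least-squares fit of a trigonometric polynomial to the series y."""
    X = trig_design(t, periods, harmonics)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef


def fit_ar(resid, order=1):
    """Least-squares AR(order) fit (an assumed stand-in for a full ARMA fit)."""
    n = len(resid)
    target = resid[order:]
    lags = np.column_stack(
        [resid[order - k: n - k] for k in range(1, order + 1)]
    )
    phi, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return phi


# Synthetic hourly change rates over four weeks (hypothetical data).
rng = np.random.default_rng(0)
n = 24 * 7 * 4
t = np.arange(n, dtype=float)
periods = (24.0, 168.0)  # assumed daily and weekly periods, in hours

# Irregular component simulated as an AR(1) process with coefficient 0.6.
eps = rng.normal(0.0, 1.0, n)
noise = np.zeros(n)
noise[0] = eps[0]
for i in range(1, n):
    noise[i] = 0.6 * noise[i - 1] + eps[i]

y = (50.0
     + 20.0 * np.cos(2.0 * np.pi * t / 24.0)
     + 10.0 * np.sin(2.0 * np.pi * t / 168.0)
     + noise)

# Step 1: deterministic (trend and seasonal) part via trigonometric polynomial.
coef, deterministic = fit_trig(y, t, periods)

# Step 2: model the irregular component left after removing that part.
resid = y - deterministic
phi = fit_ar(resid, order=1)

# Step 3: one-step-ahead prediction = deterministic part + AR correction.
t_next = np.array([float(n)])
det_next = trig_design(t_next, periods) @ coef
forecast = det_next[0] + phi[0] * resid[-1]

print("estimated AR(1) coefficient:", phi[0])
print("one-step-ahead forecast:", forecast)
```

With a real series, the counts of content changes per time interval would replace `y`, and a dedicated time series library would be a more natural choice for identifying and estimating a full ARMA model of the residuals.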
|Title of host publication||PDCAT 2012 Parallel and Distributed Computing, Applications and Technologies|
|Number of pages||5|
|Publication status||Published - 2012|
|Event||PDCAT 2012 - Beijing|
Duration: 14 Dec 2012 → 16 Dec 2012
|Period||14/12/12 → 16/12/12|
- Parallel and distributed computing