Abstract
Web content changes have a strong impact on
search engines and more generally on technologies dealing with
content retrieval and management. These technologies have to
take account of the temporal patterns of these changes and
adjust their crawling policies accordingly. This paper presents
a methodological framework – based on time series analysis –
for modeling and predicting the dynamics of the content changes.
To test this framework, we analyze the content of three major
news websites whose change patterns are characterized by large
fluctuations and significant differences across days and hours.
The classical decomposition of the observed time series into trend,
seasonal and irregular components is applied to identify the
weekly and daily patterns as well as the remaining fluctuations.
The corresponding models are used for predicting the future
dynamics of the sites based on their current and historical
behavior.
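The classical decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes an additive model (observed = trend + seasonal + irregular), a trend estimated with a centered moving average, and seasonal indices taken as the average detrended value at each position in the cycle. The function names, the additive form, and the choice of period are assumptions made for illustration (for example, hourly change counts with a period of 24 for a daily pattern); the abstract does not state which variant the authors use.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Estimate the trend; entries are None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: 2 x period average, half weight on both end points.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def classical_decompose(series, period):
    """Additive decomposition into (trend, seasonal, irregular) lists."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of observations")
    trend = centered_moving_average(series, period)

    # Collect detrended values by their position within the cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, m) in enumerate(zip(series, trend)):
        if m is not None:
            buckets[t % period].append(y - m)

    # Average per position, then center the indices so they sum to zero.
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    index = [s - offset for s in raw]

    seasonal = [index[t % period] for t in range(len(series))]
    irregular = [
        None if m is None else y - m - s
        for y, m, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, irregular
```

Under these assumptions, a simple forecast extends the trend and adds the seasonal index for the target position in the cycle; nested patterns (daily within weekly) could be handled by decomposing the remainder again with a second period.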
Original language | English |
---|---|
Host publication title | 2018 32nd International Conference on Advanced Information Networking and Applications Workshops |
Pages | 12-17 |
Number of pages | 6 |
DOI | |
Publication status | Published - 2018 |
Event | 2018 32nd International Conference on Advanced Information Networking and Applications Workshops - Kraków. Duration: 16 May 2018 → 18 May 2018 |
Conference
Conference | 2018 32nd International Conference on Advanced Information Networking and Applications Workshops |
---|---|
City | Kraków |
Period | 16 May 2018 → 18 May 2018 |
Keywords
- Forecasting
- Web content dynamics