TY - JOUR
T1 - Characterization of the Evolution of a News Web Site
AU - Calzarossa, Maria Carla
AU - Tessera, Daniele
PY - 2008
Y1 - 2008
N2 - The Web has become a ubiquitous tool for distributing knowledge and information and for conducting
business. To exploit the huge potential of the Web as a global information repository, it is necessary
to understand its dynamics. These issues are particularly important for news Web sites as they are
expected to provide fresh information on current world events to a potentially large user population. This
paper presents an experimental study aimed at characterizing and modeling the evolution of a news Web
site. We focused on the MSNBC Web site as it is a good representative of its category in terms of structure,
news coverage and popularity. Specifically, we analyzed how often and to what extent the content of this
site changed and we identified models describing its dynamics. The study has shown that the rate of page
creations and updates was characterized by some well-defined patterns that varied as a function of time
of day and day of week. By contrast, the content of individual pages changed to varying extents.
Most updates involved a very small fraction of their content, whereas very few were more extensive
and spread over the whole page. By taking into account all these aspects, we derived analytical models
able to accurately capture and reproduce the evolution of the news Web site.
KW - DYNAMIC WEB CONTENTS
KW - PERFORMANCE ANALYSIS
KW - WORKLOAD CHARACTERIZATION
UR - http://hdl.handle.net/10807/30141
U2 - doi:10.1016/j.jss.2008.04.038
DO - doi:10.1016/j.jss.2008.04.038
M3 - Article
SN - 0164-1212
SP - 2336
EP - 2344
JO - Journal of Systems and Software
JF - Journal of Systems and Software
ER -