TY - JOUR
T1 - Performance Prediction of Cloud-Based Big Data Applications
AU - Ardagna, Danilo
AU - Barbierato, Enrico
AU - Evangelinou, Athanasia
AU - Gianniti, Eugenio
AU - Gribaudo, Marco
AU - Pinto, Túlio B. M.
AU - Guimarães, Anna
AU - Da Silva, Ana Paula Couto
AU - Almeida, Jussara M.
PY - 2018
Y1 - 2018
N2 - Data heterogeneity and irregularity are key characteristics of big data applications that often overwhelm the existing software and hardware infrastructures. In such context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application's current needs. Yet, the same characteristics impose extra challenges to predicting the performance of cloud-based big data applications, a central step in proper management and planning. This paper explores two modeling approaches for performance prediction of cloud-based big data applications. We evaluate a queuing-based analytical model and a novel fast ad-hoc simulator in various scenarios based on different applications and infrastructure setups. Our results show that our approaches can predict average application execution times with 26% relative error in the very worst case and about 12% on average. Moreover, our simulator provides performance estimates 70 times faster than state of the art simulation tools.
AB - Data heterogeneity and irregularity are key characteristics of big data applications that often overwhelm the existing software and hardware infrastructures. In such context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application's current needs. Yet, the same characteristics impose extra challenges to predicting the performance of cloud-based big data applications, a central step in proper management and planning. This paper explores two modeling approaches for performance prediction of cloud-based big data applications. We evaluate a queuing-based analytical model and a novel fast ad-hoc simulator in various scenarios based on different applications and infrastructure setups. Our results show that our approaches can predict average application execution times with 26% relative error in the very worst case and about 12% on average. Moreover, our simulator provides performance estimates 70 times faster than state of the art simulation tools.
KW - Fluid models, performance
UR - http://hdl.handle.net/10807/202851
U2 - 10.1145/3184407.3184420
DO - 10.1145/3184407.3184420
M3 - Article
SN - 0920-8542
SP - 192
EP - 199
JO - THE JOURNAL OF SUPERCOMPUTING
JF - THE JOURNAL OF SUPERCOMPUTING
ER -