TY - JOUR
T1 - A Machine Learning Predictive Model of Bloodstream Infection in Hospitalized Patients
AU - Murri, Rita
AU - De Angelis, Giulia
AU - Antenucci, Laura
AU - Fiori, Barbara
AU - Rinaldi, Riccardo
AU - Fantoni, Massimo
AU - Damiani, Andrea
AU - Patarnello, Stefano
AU - Sanguinetti, Maurizio
AU - Valentini, Vincenzo
AU - Posteraro, Brunella
AU - Masciocchi, Carlotta
PY - 2024
Y1 - 2024
N2 - The aim of the study was to build a machine learning-based predictive model to discriminate between hospitalized patients at low risk and high risk of bloodstream infection (BSI). A Data Mart including all patients hospitalized between January 2016 and December 2019 with suspected BSI was built. Multivariate logistic regression was applied to develop a clinically interpretable machine learning predictive model. The model was trained on 2016-2018 data and tested on 2019 data. A feature selection based on a univariate logistic regression first selected candidate predictors of BSI. A multivariate logistic regression with stepwise feature selection in five-fold cross-validation was applied to express the risk of BSI. A total of 5660 hospitalizations (4026 and 1634 in the training and the validation subsets, respectively) were included. Eleven predictors of BSI were identified. The performance of the model in terms of AUROC was 0.74. Based on the interquartile predicted risk score, 508 (31.1%) patients were defined as being at low risk, 776 (47.5%) at medium risk, and 350 (21.4%) at high risk of BSI. Of them, 14.2% (72/508), 30.8% (239/776), and 64% (224/350) had a BSI, respectively. The performance of the predictive model of BSI is promising. Computational infrastructure and machine learning models can help clinicians identify people at low risk for BSI, ultimately supporting an antibiotic stewardship approach.
KW - bloodstream infections
KW - machine learning
KW - prediction
UR - http://hdl.handle.net/10807/271276
U2 - 10.3390/diagnostics14040445
DO - 10.3390/diagnostics14040445
M3 - Article
SN - 2075-4418
VL - 14
SP - 445
EP - 445
JO - Diagnostics
JF - Diagnostics
ER -