TY - JOUR
T1 - Natural language processing and String Metric-assisted Assessment of Semantic Heterogeneity method for capturing and standardizing unstructured nursing activities in a hospital setting: a retrospective study
AU - Vanalli, M.
AU - Cesare, M.
AU - Cesare, Manuele
AU - Cocchieri, Antonello
AU - D’Agostino, F.
PY - 2022
Y1 - 2022
N2 - Background: Nurses record data in electronic health records (EHRs) using different terminologies and coding systems. The purpose of this study was to identify unstructured free-text nursing activities recorded by nurses in EHRs with natural language processing (NLP) techniques and to map these nursing activities into standard nursing activities using the SMASH method.
Study design: A retrospective study using NLP techniques with a unidirectional mapping strategy called SMASH.
Methods: The unstructured free-text nursing activities recorded in the Medicine, Neurology and Gastroenterology inpatient units of the Agostino Gemelli IRCCS University Hospital Foundation, Rome, Italy were collected for 6 months in 2018. Data were analyzed by three phases: a) text summarization component with NLP techniques, b) a consensus analysis by four experts to detect the category of word stems, and c) cross-mapping with SMASH. The SMASH method calculated the string comparison, similarity and distance of words through the Levenshtein distance (LD), Jaro-Winker distance and the cross-mapping's cut-offs: map [0.80-1.00] with < 13 LD, partial-map [0.50-0.79] with <13 LD and no map [0.0-0.49] with >13 LD.
Results: During the study period, 491 patient records were assessed. 548 different unstructured free-text nursing activities were recorded by nurses. 451 unstructured free-text nursing activities (82.3%) were mapped to standard PAI nursing activities, 47 (8.7%) were partial mapped, while 50 (9.0%) were not mapped. This automated mapping yielded recall of 0.95%, precision of 0.94%, accuracy of 0.91%, F-measure of 0.96. The F-measure indicates good reliability of this automated procedure in cross-mapping.
Conclusions: Lexical similarities between unstructured free-text nursing activities and standard nursing activities were found, NLP with the SMASH method is a feasible approach to extract data related to nursing concepts that are not recorded through structured data entry.
AB - Background: Nurses record data in electronic health records (EHRs) using different terminologies and coding systems. The purpose of this study was to identify unstructured free-text nursing activities recorded by nurses in EHRs with natural language processing (NLP) techniques and to map these nursing activities into standard nursing activities using the SMASH method.
Study design: A retrospective study using NLP techniques with a unidirectional mapping strategy called SMASH.
Methods: The unstructured free-text nursing activities recorded in the Medicine, Neurology and Gastroenterology inpatient units of the Agostino Gemelli IRCCS University Hospital Foundation, Rome, Italy were collected for 6 months in 2018. Data were analyzed by three phases: a) text summarization component with NLP techniques, b) a consensus analysis by four experts to detect the category of word stems, and c) cross-mapping with SMASH. The SMASH method calculated the string comparison, similarity and distance of words through the Levenshtein distance (LD), Jaro-Winker distance and the cross-mapping's cut-offs: map [0.80-1.00] with < 13 LD, partial-map [0.50-0.79] with <13 LD and no map [0.0-0.49] with >13 LD.
Results: During the study period, 491 patient records were assessed. 548 different unstructured free-text nursing activities were recorded by nurses. 451 unstructured free-text nursing activities (82.3%) were mapped to standard PAI nursing activities, 47 (8.7%) were partial mapped, while 50 (9.0%) were not mapped. This automated mapping yielded recall of 0.95%, precision of 0.94%, accuracy of 0.91%, F-measure of 0.96. The F-measure indicates good reliability of this automated procedure in cross-mapping.
Conclusions: Lexical similarities between unstructured free-text nursing activities and standard nursing activities were found, NLP with the SMASH method is a feasible approach to extract data related to nursing concepts that are not recorded through structured data entry.
KW - Cross-mapping
KW - clinical nursing informationsystem
KW - natural language processing
KW - nursing activities
KW - professional assessment instrument
KW - standardized nursing terminology
KW - Cross-mapping
KW - clinical nursing informationsystem
KW - natural language processing
KW - nursing activities
KW - professional assessment instrument
KW - standardized nursing terminology
UR - http://hdl.handle.net/10807/199541
U2 - 10.7416/ai.2022.2517
DO - 10.7416/ai.2022.2517
M3 - Article
SN - 1120-9135
VL - 2022
SP - N/A-N/A
JO - Annali di igiene : medicina preventiva e di comunita
JF - Annali di igiene : medicina preventiva e di comunita
ER -