TY - JOUR
T1 - Artificial Intelligence and the Illusion of Understanding: A Systematic Review of Theory of Mind and Large Language Models
AU - Marchetti, Antonella
AU - Manzi, Federico
AU - Riva, Giuseppe
AU - Gaggioli, Andrea
AU - Massaro, Davide
PY - 2025
Y1 - 2025
N2 - The development of Large Language Models (LLMs) has sparked significant debate regarding their capacity for Theory of Mind (ToM), the ability to attribute mental states to oneself and others. This systematic review examines the extent to which LLMs exhibit Artificial ToM (AToM) by evaluating their performance on ToM tasks and comparing it with human responses. While LLMs, particularly GPT-4, perform well on first-order false belief tasks, they struggle with more complex reasoning, such as second-order beliefs and recursive inferences, where humans consistently outperform them. Moreover, the review underscores the variability in ToM assessments, as many studies adapt classical tasks for LLMs, raising concerns about comparability with human ToM. Most evaluations remain constrained to text-based tasks, overlooking embodied and multimodal dimensions crucial to human social cognition. This review discusses the "illusion of understanding" in LLMs for two primary reasons: first, their lack of the developmental and cognitive mechanisms necessary for genuine ToM, and second, methodological biases in test designs that favor LLMs' strengths, limiting direct comparisons with human performance. The findings highlight the need for more ecologically valid assessments and interdisciplinary research to better delineate the limitations and potential of AToM. This set of issues is highly relevant to psychology, as language is generally considered just one component in the broader development of human ToM, a perspective that contrasts with the dominant approach in AToM studies. This discrepancy raises critical questions about the extent to which human ToM and AToM are comparable.
KW - Artificial Intelligence (AI)
KW - Large Language Models (LLMs)
KW - Theory of Mind (ToM)
KW - social reasoning
UR - https://publicatt.unicatt.it/handle/10807/313666
UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=105004917882&origin=inward
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105004917882&origin=inward
U2 - 10.1089/cyber.2024.0536
DO - 10.1089/cyber.2024.0536
M3 - Article
SN - 2152-2715
SP - 1
EP - 10
JO - Cyberpsychology, Behavior, and Social Networking
JF - Cyberpsychology, Behavior, and Social Networking
ER -