Researchers have taken a deep dive into understanding how well large language models (LLMs) like GPT-4 grasp complex human thoughts and emotions. This human ability, known as higher-order theory of mind (ToM), lets us reason about what others believe, feel, and know in a layered way (like "I think you believe she knows").
The study introduced a new test called Multi-Order Theory of Mind Q&A to measure this skill. The researchers tested five advanced LLMs and compared them against adult human performance.
Key Findings:
• GPT-4 and Flan-PaLM perform at or near adult human levels on ToM tasks.
• GPT-4 even surpasses adult performance on 6th-order inferences!
• Model size and fine-tuning both appear to play a clear role in the emergence of these ToM abilities.
Why does this matter? Higher-order ToM is crucial for many human interactions, both cooperative and competitive. These findings could greatly influence how we design user-facing AI applications, making them more intuitive and effective.
Try a 6th-order inference yourself ("I know that you think that she knows that he fears that I will believe that you understand"), and you'll quickly see that humans have no business handling 7th and higher orders.
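The layered structure of these statements is easy to see if you build them mechanically: an nth-order inference is just n mental-state clauses chained with "that". A minimal sketch (my own illustration, not code from the study, using the clauses from the example sentence above):

```python
# Mental-state clauses, in order, from the 6th-order example sentence.
# These are illustrative; any agent/verb pairs would work the same way.
CLAUSES = ["I know", "you think", "she knows",
           "he fears", "I will believe", "you understand"]

def nth_order(n):
    """Build an n-th order theory-of-mind statement by chaining
    the first n mental-state clauses with 'that'."""
    if not 1 <= n <= len(CLAUSES):
        raise ValueError(f"n must be between 1 and {len(CLAUSES)}")
    return " that ".join(CLAUSES[:n])

print(nth_order(6))
# -> I know that you think that she knows that he fears that I will believe that you understand
```

Each extra order wraps one more mind around the proposition, which is exactly what makes tracking them so hard for humans past a few levels.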
Check out the full study for more insights: LLMs achieve adult human performance on higher-order theory of mind tasks