Mondo Visione Worldwide Financial Markets Intelligence

FTSE Mondo Visione Exchanges Index:

BIS: Testing The Cognitive Limits Of Large Language Models

Date 04/01/2024

BIS Bulletin  |  No 83  |  
04 January 2024
PDF full text 
(652kb)
  |  9 pages

Key takeaways

  • When posed with a logical puzzle that demands reasoning about the knowledge of others and about counterfactuals, large language models (LLMs) display a distinctive and revealing pattern of failure.
  • The LLM performs flawlessly when presented with the original wording of the puzzle available on the internet but performs poorly when incidental details are changed, suggestive of a lack of true understanding of the underlying logic.
  • Our findings do not detract from the considerable progress in central bank applications of machine learning to data management, macro analysis and regulation/supervision. They do, however, suggest that caution should be exercised in deploying LLMs in contexts that demand rigorous reasoning in economic analysis.