When tested with a classic psychological assessment, advanced AI models experienced a total breakdown in focus. A new PNAS Nexus study suggests these systems lack the human-like executive control necessary to override automatic responses and maintain complex goals.
Given that the LLMs could follow the short lists of words well but not the longer ones, and that they were processing images, not text, I think it’s more likely that their context just filled up and they forgot the original instructions (or that those instructions were assigned a lower weight in the computation).