Advanced AI models suffer a near-total collapse on classic psychology test as cognitive demands increase

sanitation@lemmy.today · 3 days ago

scratchee@feddit.uk · 3 days ago

Afaik that is handled through tool use in modern models (i.e. they didn’t learn to do maths, they learnt to use a calculator). Assuming that’s true and I haven’t missed some advance, their conclusions are likely still relevant.

Edit: though the article does seem to discard chain-of-thought techniques a little readily. It feels like they could come close to fitting the role of executive control, but perhaps that’s just the article lacking detail from the original work.
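
For anyone wondering what the "calculator" side of tool use might look like, here is a minimal sketch. It assumes the model emits a plain arithmetic string and the host program evaluates it. The name `calculator_tool` and the set of supported operators are made up for illustration and do not come from any particular model or framework.

```python
import ast
import operator

# Operators the hypothetical calculator tool is willing to evaluate.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculator_tool(expression):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        # Numeric literals only (bool is excluded even though it subclasses int).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Anything else (names, calls, attributes) is rejected.
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(calculator_tool("12345 * 6789"))
```

The model only has to produce the string `"12345 * 6789"`. The arithmetic itself is done by ordinary code, which is the sense in which it "learnt to use a calculator".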

Monument@piefed.world · 3 days ago

My high school math teachers would be so disappointed in them.

scratchee@feddit.uk · 3 days ago

If I could wire a calculator into my brain I would have cheated on all the maths tests tbf

toynbee@piefed.social · 1 day ago

This was surprisingly hard to find in an easily shareable form.

MangoCats@feddit.it · 3 days ago

What I see in the modern models is that you can often ask them to write a program or script to do a task, and they do that much more successfully than doing the task itself directly. Once they have debugged the program, it is usually 100% reliable for the specified tasks. Ask them to do those simple tasks directly and you get all kinds of creatively wrong answers.
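
A small illustration of the kind of script meant here. Letter counting is an assumed example task, not one named in the thread, and `count_letter` is a hypothetical helper.

```python
def count_letter(text, letter):
    """Count case-insensitive occurrences of a single letter in text."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return text.lower().count(letter.lower())


print(count_letter("strawberry", "r"))
```

Once a script like this exists, it gives the same answer every time for the same input, which is the reliability difference described above.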