AI fails classic attention test, with longer word lists triggering dramatic accuracy collapse

Giving AI a classic psychological test reveals an inherent weakness in the decision-making abilities of large language models (LLMs). Suketu Patel and colleagues explored how transformer-based machine attention differs from human attention by testing AI models on the "Stroop task," in which color words are printed in colored ink that may not match the word (for example, the word "red" printed in blue ink), and participants are asked to name the ink color of each word while ignoring its meaning.