Don’t panic: ‘Humanity’s last exam’ has begun

When artificial intelligence systems began acing long-standing academic assessments, researchers realized they had a problem: the tests were too easy. Popular evaluations, such as the Massive Multitask Language Understanding (MMLU) exam, once considered formidable, are no longer challenging enough to meaningfully test advanced AI systems.