Teaching AI models to say ‘I’m not sure’ in cases of calibration errors

Confidence is persuasive. In artificial intelligence systems, it is often misleading. Today's most capable reasoning models share a trait with the loudest voice in the room: They deliver every answer with the same unshakable certainty, whether they're right or guessing. Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced that overconfidence to a specific flaw in how these models are trained, and developed a method that fixes it without giving up any accuracy. The team's research is published on the arXiv preprint server.