CRYPTOREPORTCLUB

New method can teach AI to admit uncertainty

June 26, 2025

Stephanie Baum, scientific editor
Andrew Zinin, lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

  • fact-checked
  • preprint
  • trusted source
  • proofread

Teaching AI to admit uncertainty
DeepSeek R1-32B’s accuracy is a function of compute budget and confidence threshold. Credit: arXiv (2025). DOI: 10.48550/arxiv.2502.13962

In high-stakes situations like health care—or weeknight "Jeopardy!"—it can be safer to say "I don't know" than to answer incorrectly. Doctors, game show contestants, and standardized test-takers understand this, but most artificial intelligence applications still prefer to give a potentially wrong answer rather than admit uncertainty.

Johns Hopkins computer scientists think they have a solution: a new method that lets AI models spend more time thinking through problems and uses a confidence score to decide when the model should say "I don't know" rather than risk a wrong answer. That ability is crucial in high-stakes domains like medicine, law, or engineering.

The work appears on the arXiv preprint server, and the research team will present its findings at the 63rd Annual Meeting of the Association for Computational Linguistics, to be held July 27 through Aug. 1 in Vienna, Austria.

"It all started when we saw that cutting-edge large language models spend more time thinking to solve harder problems. So we wondered—can this additional thinking time also help these models determine whether or not a problem has been solved correctly so they can report that back to the user?" says first author William Jurayj, a Ph.D. student studying computer science who is affiliated with the Whiting School of Engineering's Center for Language and Speech Processing.

To investigate, the team had large language models generate reasoning chains of different lengths as they answered difficult math problems and then measured how the chain length affected both the model's final answer and its confidence in it. The researchers had the models answer only when their confidence exceeded a given threshold—meaning "I don't know" was an acceptable response.
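The answer-or-abstain rule in that setup can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the function name `selective_answer` and its parameters are hypothetical, and the confidence score is assumed to be a number between 0 and 1, however it is obtained.

```python
from typing import Optional


def selective_answer(answer: str, confidence: float, threshold: float) -> Optional[str]:
    """Return `answer` only when `confidence` exceeds `threshold`.

    A return value of None stands for "I don't know".
    All names here are hypothetical; confidence is assumed to lie in [0, 1].
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return answer if confidence > threshold else None


# Same answer and confidence, two different bars for confidence.
print(selective_answer("x = 7", confidence=0.92, threshold=0.80))  # prints x = 7
print(selective_answer("x = 7", confidence=0.92, threshold=0.95))  # prints None
```

Raising the threshold turns the same model output from an answer into an abstention, which is why the choice of threshold matters as much as the answer itself.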

They found that thinking more generally improves models' accuracy and confidence. But even with plenty of time to consider, models can still make wild guesses or give wrong answers, especially without penalties for incorrect responses. In fact, the researchers found that when they set a high bar for confidence and let models think for even longer, the models' accuracy actually decreased.

"This happens because answer accuracy is only part of a system's performance," Jurayj explains. "When you demand high confidence, letting the system think longer means it will provide more correct answers and more incorrect answers. In some settings, the extra correct answers are worth the risk. But in other, high-stakes environments, this might not be the case."

Motivated by this finding, the team suggested three different "odds" settings to penalize wrong answers: exam odds, where there's no penalty for an incorrect answer; "Jeopardy!" odds, where correct answers are rewarded at the same rate as incorrect ones are penalized; and high-stakes odds, where an incorrect answer is penalized more harshly than a correct answer is rewarded.
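A rough Python sketch shows how the three settings can rank the same two runs differently. Everything in it is an assumption made for illustration: the scoring rule (one point per correct answer, zero per abstention), the high-stakes penalty of 20, and the answer counts, which are invented and not taken from the study.

```python
# Penalty charged for each wrong answer under the three settings.
# The high-stakes value is an illustrative placeholder, not a figure from the article.
ODDS = {"exam": 0.0, "jeopardy": 1.0, "high_stakes": 20.0}


def utility(n_correct: int, n_incorrect: int, penalty: float) -> float:
    """Score a run: +1 per correct answer, -penalty per wrong one, 0 per abstention (assumed)."""
    return n_correct - penalty * n_incorrect


# Hypothetical counts out of 100 questions at a strict confidence threshold.
short_budget = (38, 2)  # (correct, incorrect); the other 60 are abstentions
long_budget = (54, 6)   # more of both; the other 40 are abstentions

for name, penalty in ODDS.items():
    print(name, utility(*short_budget, penalty), utility(*long_budget, penalty))
```

With these made-up counts, the longer budget answers more questions correctly (54 versus 38) and more incorrectly (6 versus 2), so accuracy on answered questions falls from 95% to 90%. The longer budget still scores higher under exam odds (54 versus 38) and "Jeopardy!" odds (48 versus 36), but far lower under the assumed high-stakes penalty (-66 versus -2).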

They found that under stricter odds, a model should decline to answer a question if it isn't confident enough in its answer after expending its computing budget. And at higher confidence thresholds, this will mean that more questions go unanswered—but that isn't necessarily a bad thing.
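One way to see why stricter odds call for more abstention is a simple expected-value calculation. The sketch below is not taken from the paper. It assumes a correct answer earns one point, an abstention earns zero, a wrong answer costs `penalty` points, and the confidence score can be read as the probability of being correct. Under those assumptions, answering pays off only when confidence exceeds penalty / (1 + penalty). The function name is hypothetical.

```python
def break_even_confidence(penalty: float) -> float:
    """Confidence above which answering beats abstaining in expectation.

    Assumes reward +1 for a correct answer, 0 for abstaining, -penalty for a
    wrong answer, and a calibrated confidence p. Answering is worth it when
    p * 1 - (1 - p) * penalty > 0, which rearranges to p > penalty / (1 + penalty).
    """
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return penalty / (1.0 + penalty)


print(break_even_confidence(0.0))             # exam odds: prints 0.0
print(break_even_confidence(1.0))             # "Jeopardy!" odds: prints 0.5
print(round(break_even_confidence(20.0), 3))  # illustrative high-stakes penalty: prints 0.952
```

With no penalty, guessing never hurts. With equal reward and penalty, the model should be more likely right than wrong before it answers. With a heavy penalty, only near-certain answers are worth giving.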

"A student might be mildly annoyed to wait 10 minutes only to find out that she needs to solve a math problem herself because the AI model is unsure," Jurayj says. "But in high-stakes environments, this is infinitely preferable to waiting five minutes for an answer that looks correct but is not."

Now, the team is encouraging the broader AI research community to report models' question-answering performance under exam and "Jeopardy!" odds so that everyone can benefit from AI with better-calibrated confidence.

"We hope the research community will accept our invitation to report performance in settings with non-zero costs for incorrect answers, as this will naturally motivate the development of better methods for uncertainty quantification," says Jurayj.

More information: William Jurayj et al, Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering, arXiv (2025). DOI: 10.48550/arxiv.2502.13962

Journal information: arXiv

Provided by Johns Hopkins University

Citation: New method can teach AI to admit uncertainty (2025, June 26) retrieved 26 June 2025 from https://techxplore.com/news/2025-06-method-ai-uncertainty.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

© 2023-2025 Cryptoreportclub. All Rights Reserved
