CRYPTOREPORTCLUB
Tuesday, July 29, 2025

Researchers test the trustworthiness of AI by teaching it to play sudoku

July 28, 2025
Stephanie Baum, scientific editor

Andrew Zinin, lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked, trusted source, proofread

Image: sudoku. Credit: Pixabay/CC0 Public Domain

Artificial intelligence tools called large language models (LLMs), such as OpenAI's ChatGPT or Google's Gemini, can do a lot these days—dispensing relationship advice, crafting texts to get you out of social obligations and even writing science articles.

But can they also solve your morning sudoku?

In a new study, a team of computer scientists from the University of Colorado Boulder decided to find out. The group created nearly 2,300 original sudoku puzzles, which require players to enter numbers into a grid following certain rules, then asked several AI tools to fill them in.

The results were a mixed bag. While some of the AI models could solve easy sudokus, even the best struggled to explain how they solved them—giving garbled, inaccurate or even surreal descriptions of how they arrived at their answers. The results raise questions about the trustworthiness of AI-generated information, said study co-author Maria Pacheco.

"For certain types of sudoku puzzles, most LLMs still fall short, particularly in producing explanations that are in any way usable for humans," said Pacheco, assistant professor in the Department of Computer Science. "Why did it come up with that solution? What are the steps you need to take to get there?"

She and her colleagues have published their results in Findings of the Association for Computational Linguistics.

The researchers aren't trying to cheat at puzzles. Instead, they're using these logic exercises to explore how AI platforms think. The results could one day lead to more reliable and trustworthy computer programs, said study co-author Fabio Somenzi, professor in the Department of Electrical, Computer and Energy Engineering.

"Puzzles are fun, but they're also a microcosm for studying the decision-making process in machine learning," he said. "If you have AI prepare your taxes, you want to be able to explain to the IRS why the AI wrote what it wrote."

Daily puzzle

Somenzi, who is a self-described sudoku fan, noted that the puzzles tap into a very human way of thinking. Filling out a sudoku grid requires puzzlers to learn and follow a set of logical rules. For example, you can't enter a two in an empty square if there's already a two in the same row or column.
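The row-and-column rule described above is simple enough to state in a few lines of code. The following is a minimal sketch, not anything taken from the study: the function name `blocked`, the use of `0` for an empty square and the sample grid are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[int]]  # assumption: 0 marks an empty square


def blocked(grid: Grid, row: int, col: int, digit: int) -> bool:
    """Return True if `digit` already appears in the same row or column."""
    in_row = digit in grid[row]
    in_col = any(grid[r][col] == digit for r in range(len(grid)))
    return in_row or in_col


# Hypothetical, mostly empty six-by-six grid used only as an example.
grid = [
    [0, 2, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [5, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(blocked(grid, 0, 0, 2))  # True: a two is already in row 0
print(blocked(grid, 1, 0, 5))  # True: a five is already in column 0
print(blocked(grid, 1, 0, 2))  # False: no two in row 1 or column 0
```

A full sudoku check would also look at the box containing the square; this sketch covers only the row and column rule mentioned in the text.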

Most LLMs today struggle with that kind of thinking, in large part because of how they're trained.

To build ChatGPT, for example, programmers first trained the AI on an enormous body of text, much of it gathered from the internet. When ChatGPT responds to a question, it predicts the most likely response based on all that data, almost like a computer version of rote memory.

"What they do is essentially predict the next word," Pacheco said. "If you have the start to a sentence, what word comes next? They do that by referring to every sentence in the English language that they can get their hands on."
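The idea of predicting the next word can be illustrated with a toy model that simply counts which word most often follows another. This is only a sketch of the general idea: real LLMs use large neural networks rather than count tables, and the function names and the tiny corpus below are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up training text, for illustration only.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigrams(corpus)

print(predict_next(model, "the"))     # cat ("cat" follows "the" three times)
print(predict_next(model, "sudoku"))  # None (word never seen in training)
```

Nothing in this toy model knows any rule of logic; it only reproduces patterns it has already seen, which is the limitation the researchers describe.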

Pacheco, Somenzi and their colleagues have joined a growing effort in computer science to merge those two ways of thinking—combining the memory of an LLM with a human brain's capacity for logic, a pursuit known as "neurosymbolic" AI.

Anirudh Maiya and Razan Alghamdi, both former graduate students at CU Boulder, were also co-authors of the new paper.

How's the weather?

To begin, the researchers created sudoku puzzles of varying difficulty using a six-by-six grid (a simpler version of the nine-by-nine puzzles usually found online).
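For comparison, a conventional rule-based program can fill in a six-by-six grid by trial and error with backtracking. The sketch below is not the researchers' code. It assumes the common six-by-six layout with boxes two rows tall and three columns wide, uses `0` for empty squares, and works on a sample puzzle invented for this example.

```python
from typing import List

Grid = List[List[int]]

SIZE = 6
BOX_ROWS, BOX_COLS = 2, 3  # assumed box shape for a six-by-six grid


def can_place(grid: Grid, row: int, col: int, digit: int) -> bool:
    """Check the row, column and box rules for one candidate digit."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(SIZE)):
        return False
    top = row - row % BOX_ROWS
    left = col - col % BOX_COLS
    for r in range(top, top + BOX_ROWS):
        for c in range(left, left + BOX_COLS):
            if grid[r][c] == digit:
                return False
    return True


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return False if no solution exists."""
    for row in range(SIZE):
        for col in range(SIZE):
            if grid[row][col] == 0:
                for digit in range(1, SIZE + 1):
                    if can_place(grid, row, col, digit):
                        grid[row][col] = digit
                        if solve(grid):
                            return True
                        grid[row][col] = 0  # undo and try the next digit
                return False  # no digit fits this square
    return True  # no empty squares left


# Hypothetical puzzle, not one of the study's puzzles.
puzzle = [
    [1, 0, 3, 0, 5, 0],
    [0, 5, 0, 1, 0, 3],
    [2, 0, 1, 0, 6, 0],
    [0, 6, 0, 2, 0, 1],
    [3, 0, 2, 0, 4, 0],
    [0, 4, 0, 3, 0, 2],
]
givens = [row[:] for row in puzzle]  # keep a copy of the starting clues

solved = solve(puzzle)
print(solved)
for row in puzzle:
    print(row)
```

A solver like this follows the rules exactly but cannot describe its reasoning in everyday language, which is the gap the study examines from the other direction.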

They then gave the puzzles to a series of AI models, including the preview of OpenAI's o1 model, which at its release in 2024 represented the state of the art for its kind of LLM.

The o1 model led the pack, solving roughly 65% of the sudoku puzzles correctly. Then the team asked the AI platforms to explain how they got their answers. That's when the results got really wild.
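What it means to solve a puzzle "correctly" can be made concrete: the answer must keep the original clues, and every row, column and box must contain each digit exactly once. The sketch below shows one way to grade answers and compute a solve rate. The two-by-three box shape, the function names and the sample data are assumptions for illustration; they are not the study's grading code or results.

```python
from typing import List, Tuple

Grid = List[List[int]]

SIZE = 6
BOX_ROWS, BOX_COLS = 2, 3  # assumed box shape for a six-by-six grid


def is_correct(puzzle: Grid, answer: Grid) -> bool:
    """Return True if `answer` keeps the clues and obeys all sudoku rules."""
    if len(answer) != SIZE or any(len(row) != SIZE for row in answer):
        return False
    # Every clue from the puzzle (non-zero square) must be unchanged.
    for r in range(SIZE):
        for c in range(SIZE):
            if puzzle[r][c] != 0 and puzzle[r][c] != answer[r][c]:
                return False
    target = set(range(1, SIZE + 1))
    rows = [set(row) for row in answer]
    cols = [{answer[r][c] for r in range(SIZE)} for c in range(SIZE)]
    boxes = [
        {answer[r][c]
         for r in range(top, top + BOX_ROWS)
         for c in range(left, left + BOX_COLS)}
        for top in range(0, SIZE, BOX_ROWS)
        for left in range(0, SIZE, BOX_COLS)
    ]
    return all(unit == target for unit in rows + cols + boxes)


def solve_rate(attempts: List[Tuple[Grid, Grid]]) -> float:
    """Fraction of (puzzle, answer) pairs that are correct."""
    if not attempts:
        return 0.0
    return sum(is_correct(p, a) for p, a in attempts) / len(attempts)


# Made-up puzzle and answers, for illustration only.
puzzle = [
    [1, 0, 3, 0, 5, 0],
    [0, 5, 0, 1, 0, 3],
    [2, 0, 1, 0, 6, 0],
    [0, 6, 0, 2, 0, 1],
    [3, 0, 2, 0, 4, 0],
    [0, 4, 0, 3, 0, 2],
]
good = [
    [1, 2, 3, 4, 5, 6],
    [4, 5, 6, 1, 2, 3],
    [2, 3, 1, 5, 6, 4],
    [5, 6, 4, 2, 3, 1],
    [3, 1, 2, 6, 4, 5],
    [6, 4, 5, 3, 1, 2],
]
bad = [row[:] for row in good]
bad[0][0], bad[0][1] = bad[0][1], bad[0][0]  # swap two squares to break it

print(is_correct(puzzle, good))                     # True
print(is_correct(puzzle, bad))                      # False
print(solve_rate([(puzzle, good), (puzzle, bad)]))  # 0.5
```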

"Sometimes, the AI explanations made up facts," said Ashutosh Trivedi, a co-author of the study and associate professor of computer science at CU Boulder. "So it might say, 'There cannot be a two here because there's already a two in the same row,' but that wasn't the case."
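A claim of this kind can be checked mechanically against the grid. The sketch below assumes the claim has already been extracted from the explanation into a small structured form; the `Claim` class, its fields and the sample grid are hypothetical and are not part of the study.

```python
from dataclasses import dataclass
from typing import List

Grid = List[List[int]]  # assumption: 0 marks an empty square


@dataclass
class Claim:
    """A statement such as 'there's already a two in row 0'."""
    unit: str   # "row" or "column"
    index: int  # zero-based row or column number
    digit: int


def claim_is_true(grid: Grid, claim: Claim) -> bool:
    """Check whether the digit really appears in the named row or column."""
    if claim.unit == "row":
        cells = grid[claim.index]
    elif claim.unit == "column":
        cells = [row[claim.index] for row in grid]
    else:
        raise ValueError(f"unknown unit: {claim.unit!r}")
    return claim.digit in cells


# Hypothetical grid, for illustration only.
grid = [
    [1, 0, 3, 0, 5, 0],
    [0, 5, 0, 1, 0, 3],
    [2, 0, 1, 0, 6, 0],
    [0, 6, 0, 2, 0, 1],
    [3, 0, 2, 0, 4, 0],
    [0, 4, 0, 3, 0, 2],
]

# "There's already a two in the same row" (row 0): false for this grid.
print(claim_is_true(grid, Claim("row", 0, 2)))     # False
# "There's already a two in the same column" (column 0): true.
print(claim_is_true(grid, Claim("column", 0, 2)))  # True
```

The hard part in practice is turning a free-form explanation into structured claims; once that is done, checking each claim against the grid is straightforward.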

In a telling example, the researchers were talking to one of the AI tools about solving sudoku when, for unknown reasons, it responded with a weather forecast.

"At that point, the AI had gone berserk and was completely confused," Somenzi said.

The researchers hope to design their own AI system that can do it all—solving complicated puzzles and explaining how. They're starting with another type of puzzle called hitori, which—like sudoku—involves a grid of numbers.

"People talk about the emerging capabilities of AI where they end up being able to solve things that you wouldn't expect them to solve," Pacheco said. "At the same time, it's not surprising that they're still bad at a lot of tasks."

More information: Anirudh Maiya et al, Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6×6 Sudoku (2025)

Provided by University of Colorado at Boulder

Citation: Researchers test the trustworthiness of AI by teaching it to play sudoku (2025, July 28) retrieved 28 July 2025 from https://techxplore.com/news/2025-07-trustworthiness-ai-play-sudoku.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
