July 10, 2025
Tool devised for detecting AI that scores high on accuracy, low on false accusations

Detecting writing produced by artificial intelligence is a tricky dance: Doing it right means reliably identifying AI-generated text while being careful not to falsely accuse a human of using it. Few tools strike that balance.
A team of researchers at the University of Michigan say they have devised a new way to tell whether a piece of text was written by AI, one that passes both tests. That could be especially useful in academia and public policy as AI content proliferates and becomes harder to distinguish from human-generated writing.
The team calls its tool "Liketropy," a name inspired by the theoretical backbone of its method: It blends likelihood and entropy, the two statistical ideas that power its test.
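In rough terms, and with notation assumed here rather than drawn from the paper, the idea is that if a candidate model actually generated a text, the text's average per-token surprise under that model should concentrate near the model's average predictive entropy:

```latex
% A sketch of the likelihood-entropy comparison; the symbols S_n, H_n and
% the candidate model Q are illustrative, not the paper's exact notation.
\[
  S_n = \frac{1}{n}\sum_{t=1}^{n} -\log Q\left(y_t \mid y_{<t}\right),
  \qquad
  H_n = \frac{1}{n}\sum_{t=1}^{n} H\!\left(Q(\cdot \mid y_{<t})\right)
\]
% If Q generated the tokens y_1, ..., y_n, then S_n - H_n concentrates near
% zero (the paper's finite-sample concentration inequalities bound the
% deviation), so a large gap is statistical evidence that Q did not write
% the text.
```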
They designed "zero-shot statistical tests" that can determine whether a piece of writing came from a human or a large language model (LLM) without requiring prior training on examples of each.
The current tool focuses on LLMs, the type of AI system used to produce text. It relies on statistical properties of the text itself, such as how surprising or predictable its words are, to decide whether it looks more human- or machine-generated.
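As a concrete illustration (a minimal sketch, not the Liketropy implementation), the snippet below scores a text's per-token surprise and predictive entropy under an arbitrary open-weight candidate model, GPT-2, using the Hugging Face transformers library:

```python
# A minimal sketch, not the authors' code: score a text's per-token
# surprise and entropy under a candidate LLM (GPT-2 as a stand-in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def surprise_and_entropy(text: str):
    """Return mean negative log-likelihood and mean predictive entropy
    (nats per token) of `text` under the candidate model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                 # (1, seq_len, vocab)
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    probs = log_probs.exp()
    # Negative log-likelihood of each observed next token ("surprise").
    nll = -log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    # Shannon entropy of each next-token distribution.
    entropy = -(probs * log_probs).sum(dim=-1)
    return nll.mean().item(), entropy.mean().item()

nll, ent = surprise_and_entropy("The committee will meet on Thursday.")
# For model-generated text, mean surprise tends to sit close to mean
# entropy; a large gap is evidence the model did not produce the text.
print(f"mean NLL = {nll:.3f}, mean entropy = {ent:.3f}, gap = {nll - ent:.3f}")
```

Comparing the two quantities is what makes such a test zero-shot: no detector is trained, and only the candidate model's own probabilities are consulted.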
In testing on large-scale datasets, including those whose underlying models were hidden from the public or where the AI-generated text was designed to evade detectors, the researchers say their tool performed well. When the test is designed with specific LLMs in mind as potential generators of the text, it achieves an average accuracy above 96% and a false accusation rate as low as 1%.
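That 1% figure corresponds to controlling the test's false positive rate. As a hedged sketch of the general idea, not the authors' procedure, a decision threshold can be calibrated on known human-written texts so that at most 1% of them would be flagged:

```python
# Illustrative threshold calibration, not the authors' method: pick the
# cutoff so that at most 1% of known human-written texts are flagged.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical gap scores for the |surprise - entropy| statistic sketched
# above: small gaps suggest model-generated text, large gaps human text.
human_gaps = rng.normal(loc=1.5, scale=0.4, size=5000)        # hypothetical
ai_gaps = np.abs(rng.normal(loc=0.1, scale=0.1, size=5000))   # hypothetical

target_fpr = 0.01  # at most 1% of humans falsely accused
threshold = np.quantile(human_gaps, target_fpr)  # flag "AI" if gap < cutoff

fpr = (human_gaps < threshold).mean()   # realized false accusation rate
tpr = (ai_gaps < threshold).mean()      # detection rate on AI-written text
print(f"threshold = {threshold:.3f}, FPR = {fpr:.1%}, detection = {tpr:.1%}")
```

Raising or lowering target_fpr is one way to turn the "cautious-effective" dial the researchers describe below.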
"We were very intentional about not creating a detector that just points fingers. AI detectors can be overconfident, and that's risky—especially in education and policy," said Tara Radvand, a doctoral student at U-M's Ross School of Business who co-authored the study. "Our goal was to be cautious about false accusations while still flagging AI-generated content with statistical confidence."
Among the researchers' unexpected findings was how little they needed to know about a language model to catch it. Even with minimal information about the model, the test still performed well, challenging the assumption that detection must rely on access, training or cooperation, Radvand said.
The team was motivated by fairness, particularly for international students and non-native English speakers. Emerging literature shows that students who speak English as a second language may be unfairly flagged for "AI-like" writing because of tone or sentence structure.
"Our tool can help these students self-check their writing in a low-stakes, transparent way before submission," Radvand said.
As for next steps, she and her colleagues plan to expand their demo into a tool that can be adapted to different domains. They've learned that fields such as law and science, as well as applications like college admissions, have different thresholds in the "cautious-effective" trade-off.
A critical application for AI detectors is reducing the spread of misinformation on social media. Some actors intentionally train LLMs to adopt extreme beliefs and spread falsehoods on social media to manipulate public opinion.
Because these systems can generate large-scale false content, the researchers say it's crucial to develop reliable detection tools that can flag such content and comments. Early identification helps platforms limit the reach of harmful narratives and protect the integrity of public discourse.
They also plan to speak with U-M business and university leaders about the prospect of adopting their tool as a complement to U-M GPT and the Maizey AI assistant to verify whether text was generated by these tools versus an external AI model, such as ChatGPT.
Liketropy received a Best Presentation Award at the Michigan Student Symposium for Interdisciplinary Statistical Sciences, an annual event organized by graduate students. It was also featured by Paris Women in Machine Learning and Data Science, a France-based community of women interested in machine learning and data science that hosts various events.
The research is published on the arXiv preprint server.
More information: Tara Radvand et al, Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities, arXiv (2025). DOI: 10.48550/arxiv.2501.02406
HuggingFace: huggingface.co/spaces/tararad/ … ketropy-LLM-Detector
Journal information: arXiv
Provided by University of Michigan