August 5, 2025
New research suggests ChatGPT ignores article retractions and errors when used to inform literature reviews

A new study has examined how large language models (LLMs) fail to flag articles that have been retracted or discredited when asked to evaluate their quality.
The new paper, co-authored by Professor Mike Thelwall and Dr. Irini Katsirea, is published in the journal Learned Publishing and is the latest output from a research project titled "Unreliable science: unraveling the impact of mainstream media misrepresentation," which began in October 2024.
The research team identified 217 retracted or "otherwise concerning" academic studies with high Altmetric scores and asked ChatGPT to evaluate the quality of each 30 times.
None of the 6,510 reports that ChatGPT produced mentioned that the articles had been retracted or contained relevant errors, and it gave 190 of the papers relatively high scores, indicating that they were world-leading, internationally excellent, or similar. The only criticisms ChatGPT leveled at the lowest-scoring articles concerned their academic weakness, not their retraction or other errors, though in five cases the article's topic was described as "controversial."
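The article does not describe the exact prompts or model version the researchers used, but the repeated-evaluation protocol can be sketched with the OpenAI Python client. The prompt wording, the "gpt-4o-mini" model name, the scoring scale, and the helper names below are illustrative assumptions, not the authors' actual setup:

```python
# Illustrative sketch only: the prompt, model, and score scale are assumptions,
# not the study's actual protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Score the following article for research quality on a scale of "
    "1 (weak) to 4 (world-leading), and note any reliability concerns "
    "such as retraction or errata:\n\n{article}"
)

def evaluate_article(article_text: str, runs: int = 30) -> list[str]:
    """Ask the model to assess one article `runs` times, as in the study."""
    reports = []
    for _ in range(runs):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; the study used ChatGPT
            messages=[{"role": "user", "content": PROMPT.format(article=article_text)}],
        )
        reports.append(response.choices[0].message.content)
    return reports

def mentions_retraction(reports: list[str]) -> bool:
    """The kind of check the study performed: does any report flag a retraction?"""
    return any("retract" in report.lower() for report in reports)
```

Running such a loop over all 217 articles yields the 217 × 30 = 6,510 reports the study analyzed.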
In a follow-up investigation, the team extracted 61 claims from the retracted articles in the set and asked ChatGPT 10 times whether each was true. It gave a definitive "yes" or a positive response two-thirds of the time, including for at least one claim that had been shown to be false more than a decade earlier.
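The claim-verification step could be sketched along the same lines; the prompt wording and the crude yes/no parsing here are again assumptions made for illustration:

```python
# Illustrative sketch only: the prompt and the yes/no parsing are assumptions,
# not the authors' exact method.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def check_claim(claim: str, runs: int = 10) -> float:
    """Ask the model `runs` times whether a claim is true and return the
    fraction of answers that open with an affirmative."""
    affirmatives = 0
    for _ in range(runs):
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; the study used ChatGPT
            messages=[{
                "role": "user",
                "content": f"Is the following claim true? Answer yes or no first.\n\n{claim}",
            }],
        )
        answer = response.choices[0].message.content.strip().lower()
        if answer.startswith("yes"):
            affirmatives += 1
    return affirmatives / runs
```

A score near 1.0 from such a check would mirror the study's finding that ChatGPT affirmed long-debunked claims.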
The research team concluded that these findings "emphasize, from an academic knowledge perspective, the importance of verifying information from LLMs when using them for information-seeking or analysis."
Professor Thelwall said, "The results of the study came as a surprise and the inability of ChatGPT to identify retracted research is concerning. I hope that the findings help those building these systems to improve them. I also hope that the research gives an additional warning to users to avoid trusting generative AI systems, even when they sound plausible and informed."
More information: Mike Thelwall et al, Does ChatGPT Ignore Article Retractions and Other Reliability Concerns?, Learned Publishing (2025). DOI: 10.1002/leap.2018
Provided by University of Sheffield