October 16, 2025
Why large language models are bad at imitating people

Large language models like ChatGPT and Copilot are useful for many things. However, they are not yet good enough to imitate the way people speak.
It is easy to be impressed by artificial intelligence. Many people use large language models such as ChatGPT, Copilot and Perplexity to help with a variety of tasks, or simply for entertainment.
But just how good are these large language models at pretending to be human? Not very, according to recent research.
"Large language models speak differently than people do," said Associate Professor Lucas Bietti from the Department of Psychology.
Bietti was one of the authors of a research article recently published in Cognitive Science. The lead author is Eric Mayor from the University of Basel, while the final author is Adrian Bangerter from the University of Neuchâtel.
Tested several models
The researchers tested ChatGPT-4, Claude Sonnet 3.5, Vicuna and Wayfarer. First, they compared transcripts of phone conversations between humans with conversations simulated by the large language models. They then checked whether other people could distinguish the human phone conversations from those generated by the language models.
For the most part, people are not fooled—or at least not yet. So what are the language models doing wrong?
Too much imitation
When people talk to each other, a certain amount of imitation goes on: we slightly adapt our words and the way we speak to the other person. However, this imitation is usually quite subtle.
"Large language models are a bit too eager to imitate, and this exaggerated imitation is something that humans can pick up on," Bietti said.
This is called "exaggerated alignment." But that is not all.
Incorrect use of filler words
Movies with bad scripts usually have conversations that sound artificial. In such cases, the scriptwriters have often forgotten that conversations do not consist only of the necessary content words. In real, everyday conversations, most of us include small words called "discourse markers."
These are words like "so," "well," "like" and "anyway."
These words have a social function because they can signal interest, belonging, attitude or meaning to the other person. In addition, they can also be used to structure the conversation.
Large language models are still terrible at using these words. "The large language models use these small words differently, and often incorrectly," said Bietti.
This helps to expose them as non-human. But there is more.
Opening and closing features
When you start talking to someone, you probably do not get straight to the point. Instead, you might start by saying "hey" or "so, how are you doing?" or "oh, fancy seeing you here." People tend to engage in small talk before moving on to what they actually want to talk about.
This shift from introduction to business takes place more or less automatically for humans, without being explicitly stated.
"This introduction, and the shift to a new phase of the conversation, are also difficult for large language models to imitate," said Bietti.
The same applies to the end of the conversation. We usually do not end a conversation abruptly as soon as the information has been conveyed to the other person. Instead, we often end the conversation with phrases like "alright, then," "okay," "talk to you later," or "see you soon."
Large language models do not quite manage that part either.
Better in the future? Probably
Altogether, these features cause so much trouble for the large language models that the conclusion is clear: "Today's large language models are not yet able to imitate humans well enough to consistently fool us," said Bietti.
Developments in this field are progressing so rapidly that large language models will most likely be able to do this quite soon, at least if we want them to. Or will they?
"Improvements in large language models will most likely manage to narrow the gap between human conversations and artificial ones, but key differences will probably remain," concluded Bietti.
For the time being, large language models are still not human-like enough to fool us. At least not every time.
More information: Eric Mayor et al., Can Large Language Models Simulate Spoken Human Conversations?, Cognitive Science (2025). DOI: 10.1111/cogs.70106
Journal information: Cognitive Science
Provided by Norwegian University of Science and Technology