July 2, 2025
AI might now be as good as humans at detecting emotion, political leaning and sarcasm in online conversations

When we write something to another person, over email or perhaps on social media, we may not state things directly, but our words may instead convey a latent meaning—an underlying subtext. We also often hope that this meaning will come through to the reader.
But what happens if an artificial intelligence (AI) system is at the other end, rather than a person? Can AI, especially conversational AI, understand the latent meaning in our text? And if so, what does this mean for us?
Latent content analysis is an area of study concerned with uncovering the deeper meanings, sentiments and subtleties embedded in text. For example, this type of analysis can help us grasp political leanings present in communications that are perhaps not obvious to everyone.
Understanding how intense someone's emotions are or whether they're being sarcastic can be crucial in supporting a person's mental health, improving customer service, and even keeping people safe at a national level.
These are only some examples. We can imagine benefits in other areas of life, like social science research, policy-making and business. Given how important these tasks are—and how quickly conversational AI is improving—it's essential to explore what these technologies can (and can't) do in this regard.
Research on this question is only just starting. Current work shows that ChatGPT has had limited success in detecting political leanings on news websites. Another study, which compared sarcasm detection across different large language models (LLMs)—the technology behind AI chatbots such as ChatGPT—showed that some models are better at it than others.
Finally, a study showed that LLMs can guess the emotional "valence" of words—the inherent positive or negative "feeling" associated with them. Our new study, published in Scientific Reports, tested whether conversational AI, including GPT-4—a relatively recent version of ChatGPT—can read between the lines of human-written texts.
The goal was to find out how well LLMs simulate understanding of sentiment, political leaning, emotional intensity and sarcasm—covering multiple kinds of latent meaning in a single study. The study evaluated the reliability, consistency and quality of seven LLMs, including GPT-4, Gemini, Llama-3.1-70B and Mixtral 8x7B.
We found that these LLMs are about as good as humans at analyzing sentiment, political leaning, emotional intensity and sarcasm. The study involved 33 human subjects and 100 curated items of text.
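To make the task concrete, here is a minimal sketch of the general kind of prompting setup such an evaluation involves. This is not the study's actual code: the model name, the rating scales and the prompt wording are illustrative assumptions, and it presumes access to the OpenAI Python client with an API key in the environment.

```python
# Minimal sketch (not the study's actual code): asking a chat model to rate
# one text item on the four latent dimensions discussed in the article.
# The model name and the rating scales are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Rate the following text on four dimensions and answer in JSON:\n"
    "  sentiment: -3 (very negative) to +3 (very positive)\n"
    "  political_leaning: -3 (left) to +3 (right), 0 if none\n"
    "  emotional_intensity: 0 (none) to 6 (extreme)\n"
    "  sarcastic: true or false\n\n"
    "Text: {text}"
)

def rate_item(text: str, model: str = "gpt-4") -> str:
    """Return the model's raw JSON-style rating for one text item."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # reduce run-to-run variation
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(rate_item("Oh great, another Monday. Truly the highlight of my week."))
```

In a study like the one described here, ratings of this kind would then be compared against human annotators' judgments of the same 100 items.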
For spotting political leanings, GPT-4 was more consistent than humans. That matters in fields like journalism, political science, or public health, where inconsistent judgment can skew findings or miss patterns.
GPT-4 also proved capable of picking up on emotional intensity and especially valence. Whether a tweet was composed by someone who was mildly annoyed or deeply outraged, the AI could tell—although someone still had to confirm whether the AI's assessment was correct, because AI tends to downplay emotions. Sarcasm remained a stumbling block for both humans and machines.
The study found no clear winner there—hence, using human raters doesn't help much with sarcasm detection.
Why does this matter? For one, AI like GPT-4 could dramatically cut the time and cost of analyzing large volumes of online content. Social scientists often spend months analyzing user-generated text to detect trends. GPT-4, on the other hand, opens the door to faster, more responsive research—especially important during crises, elections or public health emergencies.
Journalists and fact-checkers might also benefit. Tools powered by GPT-4 could help flag emotionally charged or politically slanted posts in real time, giving newsrooms a head start.
There are still concerns. Transparency, fairness and political leanings in AI remain issues. However, studies like this one suggest that when it comes to understanding language, machines are catching up to us fast—and may soon be valuable teammates rather than mere tools.
Although this work doesn't claim conversational AI can replace human raters completely, it does challenge the idea that machines are hopeless at detecting nuance.
Our study's findings do raise follow-up questions. If a user asks the same question of AI in multiple ways—perhaps by subtly rewording prompts, changing the order of information, or tweaking the amount of context provided—will the model's underlying judgements and ratings remain consistent?
Further research should include a systematic and rigorous analysis of how stable the models' outputs are. Ultimately, understanding and improving consistency is essential for deploying LLMs at scale, especially in high-stakes settings.
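As a rough illustration of the kind of stability check this would involve, the sketch below summarizes how much a model's numeric rating of a single text moves when the prompt is reworded or the context is reordered. The prompt-variant labels and the ratings are made-up placeholders, not results from the study.

```python
# Rough sketch of a prompt-stability check like the one called for above:
# ask the same question several ways, collect the model's numeric ratings,
# and summarize how much they move. The ratings below are hypothetical
# placeholders standing in for real model outputs.
from statistics import mean, stdev

# Hypothetical sentiment ratings (-3..+3) for one text item, obtained from
# the same model under paraphrased prompts / reordered context.
ratings_by_prompt_variant = {
    "direct question":       2,
    "reworded question":     2,
    "extra context first":   1,
    "minimal context":       2,
    "rating scale reversed": 3,
}

scores = list(ratings_by_prompt_variant.values())
spread = max(scores) - min(scores)

print(f"mean rating: {mean(scores):.2f}")
print(f"std dev:     {stdev(scores):.2f}")
print(f"range:       {spread} scale points across {len(scores)} prompt variants")
# A small spread suggests the judgment is stable to surface wording;
# a large one means results may hinge on how the prompt happens to be phrased.
```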
Provided by The Conversation
This article is republished from The Conversation under a Creative Commons license. Read the original article.