CRYPTOREPORTCLUB
  • Crypto news
  • AI
  • Technologies
Saturday, September 6, 2025

Where AI models fall short in mimicking the expressiveness of human speech

September 5, 2025

Lisa Lock

scientific editor

Andrew Zinin

lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked

trusted source

proofread

Through the Penn Undergraduate Research Mentoring Program, students Ethan Yang, Kevin Li, and Henry Huang worked with linguistics professor Jianjing Kuang to study the ability of AI models to replicate the expressiveness of human speech. Credit: University of Pennsylvania

It's not just what is said but how it's articulated that shapes the meaning of human communication, and people use intonation to highlight the most important part of a sentence. Take, for instance, the sentence "Molly mailed a melon." If someone asks, "Who mailed the melon?" people are most likely to stress the name: "MOLLY mailed a melon." If someone asked what Molly did with the melon, the stress would shift to the verb: "Molly MAILED a melon." If the question was what Molly mailed, the response would be "Molly mailed a MELON."

But put any of these questions to an artificial intelligence model that is capable of speech, and it's a different story. Jianjing Kuang, associate professor of linguistics in the School of Arts & Sciences and director of the Penn Phonetics Laboratory, says that while AI robots can articulate a word accurately, the technology to capture this use of intonation, known as prosodic focus, "is not quite there yet."

This summer, she mentored three undergraduate students (Kevin Li and Henry Huang, second-year computer science students, and Ethan Yang, a third-year mechanical engineering major) in a research project comparing human and AI speech in both production and perception. The project is part of the Penn Undergraduate Research Mentoring Program (PURM), a 10-week summer research opportunity through the Center for Undergraduate Research and Fellowships that comes with a $5,000 award.

"I've always been interested in linguistics and phonetics, but this is a really good opportunity for me to do hands-on research," says Li, who is from Kansas City, Kansas. Huang, who is from Shenzhen, China, says the experience taught him how to design an experiment and analyze data.

Supplying different contexts as input, the students generated the sentence "Molly mailed a melon" on 15 AI text-to-speech (TTS) platforms, ranging from those of major companies like OpenAI, Google, and Meta to smaller ones like Sesame AI and ElevenLabs. They also captured audio from human volunteers in Kuang's recording studio to compare AI-generated speech to the same speech from humans.

Yang, who is from Diamond Bar, California, says this project taught him how to control intonation in TTS models. The team then analyzed acoustic measures such as pitch, intensity, and duration of words using the software Praat.

They found that, compared to human production, most of the TTS models failed to place focus on the correct word. As an example, Li pulled up a graph showing that when speakers were prompted to focus on the word "mailed," the average duration of that word was significantly longer for humans than for any of the speech robots.
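The duration comparison described above can be illustrated with a small sketch. It is not the team's analysis: the function name, the idea of comparing against a neutral reading, and every number below are assumptions made up for illustration. It treats the word whose duration grows most relative to a neutral reading as the one carrying focus.

```python
from typing import Dict


def focus_by_duration(focused: Dict[str, float], neutral: Dict[str, float]) -> str:
    """Return the word whose duration grew most relative to a neutral reading.

    Both arguments map each word to its duration in seconds.
    This is a hypothetical heuristic, not the method used in the study.
    """
    ratios = {word: focused[word] / neutral[word] for word in focused}
    return max(ratios, key=ratios.get)


# Made-up durations in seconds, for illustration only (not data from the study).
neutral = {"Molly": 0.30, "mailed": 0.32, "a": 0.06, "melon": 0.40}
human_like = {"Molly": 0.29, "mailed": 0.45, "a": 0.06, "melon": 0.39}
flat_tts = {"Molly": 0.33, "mailed": 0.33, "a": 0.06, "melon": 0.40}

print(focus_by_duration(human_like, neutral))  # mailed
print(focus_by_duration(flat_tts, neutral))    # Molly
```

In the second made-up case the verb is barely lengthened, so the heuristic lands on a different word than the one the prompt asked for. A fuller analysis would also weigh pitch and intensity, which the team measured alongside duration.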

The students found "huge variability among the models," Kuang says. Some models were explicitly instructed to emphasize a certain word but could not, while others, such as those from OpenAI and Google's Gemini, were more capable. Some models emphasized more than one word, one turned the sentence into a question, and another didn't even finish the sentence. Another interesting finding, Kuang says, is that speech robots had an easier time emphasizing "Molly" than words later in the sentence.

In addition to speech production, the students ran a perception experiment, asking human listeners to rate the naturalness of an audio clip and identify whether the speaker is human or AI. Kuang says the accuracy for identifying human versus AI is very high, suggesting that AI speech is still not human-like.
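The accuracy figure mentioned here is the share of clips whose source listeners identified correctly. A minimal sketch of that calculation follows, assuming each response is recorded as a pair of true source and listener guess. The function name, the data layout, and the sample responses are hypothetical and are not taken from the experiment.

```python
from typing import List, Tuple


def identification_accuracy(responses: List[Tuple[str, str]]) -> float:
    """Fraction of clips whose source ('human' or 'ai') was guessed correctly.

    Each response is a (true_source, guessed_source) pair.
    """
    if not responses:
        raise ValueError("at least one response is required")
    correct = sum(1 for truth, guess in responses if truth == guess)
    return correct / len(responses)


# Hypothetical listener responses, for illustration only.
responses = [
    ("human", "human"),
    ("ai", "ai"),
    ("ai", "human"),
    ("human", "human"),
    ("ai", "ai"),
]

print(identification_accuracy(responses))  # 0.8
```

An accuracy near 0.5 on a balanced set would mean listeners were guessing. The article reports that the observed accuracy was very high, meaning listeners could usually tell the two apart.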

"The goal is to build bridges between science and industry. I do think they need us—our knowledge—to tell how good the model is and help move us closer to truly natural and expressive AI speech," she says. Kuang adds that working with AI also has implications for better understanding human speech and its uniqueness, such as why certain tasks come easily to us and how to develop better therapies for speech disorders.

Provided by University of Pennsylvania

Citation: Where AI models fall short in mimicking the expressiveness of human speech (2025, September 5) retrieved 5 September 2025 from https://techxplore.com/news/2025-09-ai-fall-short-mimicking-human.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

  • About Us
  • Contact Us
  • Privacy Policy
  • Terms of Use
Advertising: digestmediaholding@gmail.com

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
