July 14, 2025
Amazon's AI assistant struggles with diverse dialects, study finds

A new Cornell study has revealed that Amazon's AI shopping assistant, Rufus, gives vague or incorrect responses to users writing in some English dialects, such as African American English (AAE), especially when prompts contain typos.
The paper introduces a framework to evaluate chatbots for harms that occur when AI systems perform worse for users who speak or write in different dialects. The study has implications for the increasing number of online platforms that are incorporating chatbots based on large language models to provide services to users, the researchers said.
"Currently, chatbots may provide lower-quality responses to users who write in dialects. However, this doesn't have to be the case," said lead author Emma Harvey, a Ph.D. student at Cornell Tech. "If we train large language models to be robust to common dialectical features that exist outside of so-called Standard American English, we could see more equitable behavior."
The research received a Best Paper Award at the ACM Conference on Fairness, Accountability, and Transparency (FAccT 2025), held June 23–26. Co-authors are René F. Kizilcec, associate professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science, and Allison Koenecke, assistant professor at Cornell Tech. The paper is published in the Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency.
"Chatbots are increasingly used for high-stakes tasks, from education to government services," said Koenecke, who is also affiliated with Cornell Bowers. "We wanted to study whether users who speak and write differently—across dialects and formality levels—have comparable experiences with chatbots trained mostly on 'standard' American English."
To test their framework, the researchers audited Amazon Rufus, a chatbot in the Amazon shopping app. They used a tool called MultiVALUE to convert standard English prompts into five widely spoken dialects: AAE, Chicano English, Appalachian English, Indian English and Singaporean English. The researchers also modified these prompts to reflect real-world use by adding typos, removing punctuation and changing capitalization.
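The dialect rewriting itself is handled by MultiVALUE, but the extra noise step can be sketched as a small Python routine. This is a minimal illustration only: the commented `to_dialect` call is a placeholder name, not MultiVALUE's actual API.

```python
import random

def add_noise(prompt: str, typo_rate: float = 0.05, seed: int = 0) -> str:
    """Roughly mimic real-world input: lowercase, drop punctuation, swap adjacent letters."""
    rng = random.Random(seed)
    # Remove punctuation and capitalization.
    text = "".join(ch for ch in prompt.lower() if ch.isalnum() or ch.isspace())
    # Introduce occasional adjacent-character swaps as stand-in "typos".
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < typo_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

sae_prompt = "Is this jacket machine washable?"
# The dialect rewrite would come from MultiVALUE; `to_dialect` below is a
# hypothetical placeholder, not the library's actual interface.
# aae_prompt = to_dialect(sae_prompt, dialect="AAE")
print(add_noise(sae_prompt))  # e.g. "is this jacket machine wsahable"
```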
The team found Rufus more often gave low-quality answers that were vague or incorrect when prompted in dialects rather than in Standard American English (SAE). The gap widened when prompts included typos.
For example, when asked in SAE whether a jacket was machine-washable, Rufus answered correctly. But when the researchers rephrased the same question in AAE, dropping the linking verb ("this jacket machine washable?"), Rufus often failed to respond properly and instead directed users to unrelated products.
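In conceptual terms, the audit amounts to sending matched prompt variants to a chatbot and scoring the answers against a known ground truth. The sketch below is a minimal illustration rather than the paper's method: `query_chatbot` is a hypothetical stub (no public Rufus API is assumed), and the keyword check stands in for the authors' richer response ratings.

```python
def query_chatbot(prompt: str) -> str:
    """Stub standing in for the real shopping-assistant client."""
    return "This jacket is machine washable on a cold, gentle cycle."

def score_response(response: str, expected_keyword: str) -> str:
    """Crude rubric: 'correct' if the expected fact appears, otherwise flag the answer."""
    return "correct" if expected_keyword.lower() in response.lower() else "vague_or_incorrect"

def audit(prompt_variants: dict[str, str], expected_keyword: str) -> dict[str, str]:
    """Query the chatbot with each prompt variant (SAE, AAE, ...) and score each reply."""
    return {
        dialect: score_response(query_chatbot(prompt), expected_keyword)
        for dialect, prompt in prompt_variants.items()
    }

# The jacket question from the article, in SAE and in an AAE-style rewrite.
variants = {
    "SAE": "Is this jacket machine washable?",
    "AAE": "this jacket machine washable?",
}
print(audit(variants, expected_keyword="machine washable"))
# With the stub both score 'correct'; a real audit compares rates across dialects.
```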
"Part of this underperformance stems from specific grammatical rules," said Koenecke. "This has serious implications for widely used chatbots like Rufus, which likely underperform for a large portion of users."
Overall, the authors advocate for dialect-aware AI auditing. They also urge developers to design systems that embrace linguistic diversity.
"Chatbots are increasingly added to educational technologies as AI tutors that support a wide range of students," said Kizilcec, who leads the Future of Learning Lab and the National Tutoring Observatory at Cornell. "Linguistic audits should become standard practice to mitigate the risk of exacerbating educational inequalities."
More information: Emma Harvey et al, A Framework for Auditing Chatbots for Dialect-Based Quality-of-Service Harms, Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (2025). DOI: 10.1145/3715275.3732137
Provided by Cornell University