CRYPTOREPORTCLUB
  • Crypto news
  • AI
  • Technologies
Friday, November 7, 2025
No Result
View All Result
CRYPTOREPORTCLUB
  • Crypto news
  • AI
  • Technologies
No Result
View All Result
CRYPTOREPORTCLUB

Lay intuition as effective at jailbreaking AI chatbots as technical methods, research suggests

November 4, 2025
152
0

November 4, 2025

The GIST Lay intuition as effective at jailbreaking AI chatbots as technical methods, research suggests

Related Post

AI tech can compress LLM chatbot conversation memory by 3–4 times

AI tech can compress LLM chatbot conversation memory by 3–4 times

November 7, 2025
Magnetic materials discovered by AI could reduce rare earth dependence

Magnetic materials discovered by AI could reduce rare earth dependence

November 7, 2025
Stephanie Baum

scientific editor

Robert Egan

associate editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked

trusted source

proofread

Lay intuition as effective at jailbreaking AI chatbots as technical methods
Inquiries submitted to an AI chatbot by a Bias-a-Thon participant and the AI-generated answers showing religious bias. Credit: CSRAI / Penn State

It doesn't take technical expertise to work around the built-in guardrails of artificial intelligence (AI) chatbots like ChatGPT and Gemini, which are intended to ensure that the chatbots operate within a set of legal and ethical boundaries and do not discriminate against people of a certain age, race or gender.

A single, intuitive question can trigger the same biased response from an AI model as advanced technical inquiries, according to a team led by researchers at Penn State.

"A lot of research on AI bias has relied on sophisticated 'jailbreak' techniques," said Amulya Yadav, associate professor at Penn State's College of Information Sciences and Technology. "These methods often involve generating strings of random characters computed by algorithms to trick models into revealing discriminatory responses.

"While such techniques prove these biases exist theoretically, they don't reflect how real people use AI. The average user isn't reverse-engineering token probabilities or pasting cryptic character sequences into ChatGPT—they type plain, intuitive prompts. And that lived reality is what this approach captures."

Prior work probing AI bias—skewed or discriminatory outputs from AI systems caused by human influences in the training data, like language or cultural bias—has been done by experts using technical knowledge to engineer large language model (LLM) responses. To see how average internet users encounter biases in AI-powered chatbots, the researchers studied the entries submitted to a competition called "Bias-a-Thon." Organized by Penn State's Center for Socially Responsible AI(CSRAI), the competition challenged contestants to come up with prompts that would lead generative AI systems to respond with biased answers.

They found that the intuitive strategies employed by everyday users were just as effective at inducing biased responses as expert technical strategies. The researchers presented their findings at the 8th AAAI/ACM Conference on AI, Ethics, and Society.

Fifty-two individuals participated in the Bias-a-Thon, submitting screenshots of 75 prompts and AI responses from eight generative AI models. They also provided an explanation of the bias or stereotype that they identified in the response, such as age-related or historical bias.

The researchers conducted Zoom interviews with a subset of the participants to better understand their prompting strategies and their conceptions of ideas like fairness, representation and stereotyping when interacting with generative AI tools. Once they arrived at a participant-informed working definition of "bias"—which included a lack of representation, stereotypes and prejudice, and unjustified preferences toward groups—the researchers tested the contest prompts in several LLMs to see if they would elicit similar responses.

Lay intuition as effective at jailbreaking AI chatbots as technical methods
An inquiry submitted by a Bias-a-Thon participant and the generative AI response showing bias toward conventional beauty standards. Credit: CSRAI / Penn State

"Large language models are inherently random," said lead author Hangzhi Guo, a doctoral candidate in information sciences and technology at Penn State. "If you ask the same question to these models two times, they might return different answers. We wanted to use only the prompts that were reproducible, meaning that they yielded similar responses across LLMs."

The researchers found that 53 of the prompts generated reproducible results. Biases fell into eight categories: gender bias; race, ethnic and religious bias; age bias; disability bias; language bias; historical bias favoring Western nations; cultural bias; and political bias.

The researchers also found that participants used seven strategies to elicit these biases: role-playing, or asking the LLM to assume a persona; hypothetical scenarios; using human knowledge to ask about niche topics, where it's easier to identify biased responses; using leading questions on controversial topics; probing biases in under-represented groups; feeding the LLM false information; and framing the task as having a research purpose.

"The competition revealed a completely fresh set of biases," said Yadav, organizer of the Bias-a-Thon. "For example, the winning entry uncovered an uncanny preference for conventional beauty standards. The LLMs consistently deemed a person with a clear face to be more trustworthy than a person with facial acne, or a person with high cheekbones more employable than a person with low cheekbones.

"This illustrates how average users can help us uncover blind spots in our understanding of where LLMs are biased. There may be many more examples such as these that have been overlooked by the jailbreaking literature on LLM bias."

The researchers described mitigating biases in LLMs as a cat-and-mouse game, meaning that developers are constantly addressing issues as they arise. They suggested strategies that developers can use to mitigate these issues now, including implementing a robust classification filter to screen outputs before they go to users, conducting extensive testing, educating users and providing specific references or citations so users can verify information.

"By shining a light on inherent and reproducible biases that laypersons can identify, the Bias-a-Thon serves an AI literacy function," said co-author S. Shyam Sundar, Evan Pugh University Professor at Penn State and director of the Penn State Center for Socially Responsible Artificial Intelligence, which has since organized other AI competitions such as Fake-a-thon, Diagnose-a-thon and Cheat-a-thon.

"The whole goal of these efforts is to increase awareness of systematic problems with AI, to promote the informed use of AI among laypersons and to stimulate more socially responsible ways of developing these tools."

More information: Hangzhi Guo et al, Exposing AI Bias by Crowdsourcing: Democratizing Critique of Large Language Models, Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (2025). DOI: 10.1609/aies.v8i2.36620

Provided by Pennsylvania State University Citation: Lay intuition as effective at jailbreaking AI chatbots as technical methods, research suggests (2025, November 4) retrieved 4 November 2025 from https://techxplore.com/news/2025-11-lay-intuition-effective-jailbreaking-ai.html This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

Most users cannot identify AI racial bias—even in training data

Feedback to editors

Share212Tweet133ShareShare27ShareSend

Related Posts

AI tech can compress LLM chatbot conversation memory by 3–4 times
AI

AI tech can compress LLM chatbot conversation memory by 3–4 times

November 7, 2025
0

November 7, 2025 The GIST AI tech can compress LLM chatbot conversation memory by 3–4 times Gaby Clark scientific editor Robert Egan associate editor Editors' notes This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:...

Read moreDetails
Magnetic materials discovered by AI could reduce rare earth dependence

Magnetic materials discovered by AI could reduce rare earth dependence

November 7, 2025
Zuckerbergs put AI at heart of pledge to cure diseases

Zuckerbergs put AI at heart of pledge to cure diseases

November 7, 2025
OpenAI boss calls on governments to build AI infrastructure

OpenAI boss calls on governments to build AI infrastructure

November 7, 2025
Universal Music went from suing an AI company to partnering with it. What will it mean for artists?

Universal Music went from suing an AI company to partnering with it. What will it mean for artists?

November 7, 2025
‘Vibe coding’ named word of the year by Collins dictionary

‘Vibe coding’ named word of the year by Collins dictionary

November 7, 2025
Design principles for more reliable and trustworthy AI artists

Design principles for more reliable and trustworthy AI artists

November 7, 2025

Recent News

AI tech can compress LLM chatbot conversation memory by 3–4 times

AI tech can compress LLM chatbot conversation memory by 3–4 times

November 7, 2025

Ripple President Monica Long Issues Statement Following Rumors

November 7, 2025
Meta says it will invest $600 billion in the US, with AI data centers front and center

Meta says it will invest $600 billion in the US, with AI data centers front and center

November 7, 2025
Magnetic materials discovered by AI could reduce rare earth dependence

Magnetic materials discovered by AI could reduce rare earth dependence

November 7, 2025

TOP News

  • Russia Booted From FIFA and UEFA Soccer Events, Including World Cup

    570 shares
    Share 228 Tweet 143
  • Elections 2024: How AI will fool voters if we don’t do something now

    559 shares
    Share 224 Tweet 140
  • The US government is no longer briefing Meta about foreign influence campaigns

    556 shares
    Share 222 Tweet 139
  • Logitech’s Litra Glow streamer light falls to a new low of $40

    555 shares
    Share 222 Tweet 139
  • Meta, X, TikTok, Snap and Discord CEOs will testify before the Senate over online child safety

    617 shares
    Share 247 Tweet 154
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms of Use
Advertising: digestmediaholding@gmail.com

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • Crypto news
  • AI
  • Technologies

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved