CRYPTOREPORTCLUB
  • Crypto news
  • AI
  • Technologies
Friday, October 10, 2025
No Result
View All Result
CRYPTOREPORTCLUB
  • Crypto news
  • AI
  • Technologies
No Result
View All Result
CRYPTOREPORTCLUB

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

October 10, 2025
158
0

October 10, 2025 report

The GIST Size doesn't matter: Just a small number of malicious files can corrupt LLMs of any size

Related Post

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore

October 10, 2025
AI, drone ships and new sensors could leave submarines with few places to hide

AI, drone ships and new sensors could leave submarines with few places to hide

October 10, 2025
Paul Arnold

contributing writer – price agreed is 27.50 EUR

Gaby Clark

scientific editor

Robert Egan

associate editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked

preprint

trusted source

proofread

Size doesn't matter: just a small number of malicious files can corrupt LLMs of any size
Overview of our experiments, including examples of clean and poisoned samples, as well as benign and malicious behavior at inference time. (a)DoS pretraining backdoor experiments. Credit: arXiv (2025). DOI: 10.48550/arxiv.2510.07192

Large language models (LLMs), which power sophisticated AI chatbots, are more vulnerable than previously thought. According to research by Anthropic, the UK AI Security Institute and the Alan Turing Institute, it only takes 250 malicious documents to compromise even the largest models.

The vast majority of data used to train LLMs is scraped from the public internet. While this helps them to build knowledge and generate natural responses, it also puts them at risk from data poisoning attacks. It had been thought that as models grew, the risk was minimized because the percentage of poisoned data had to remain the same. In other words, it would need massive amounts of data to corrupt the largest models. But in this study, which is published on the arXiv preprint server, researchers showed that an attacker only needs a small number of poisoned documents to potentially wreak havoc.

To assess the ease of compromising large AI models, the researchers built several LLMs from scratch, ranging from small systems (600 million parameters) to very large (13 billion parameters). Each model was trained on vast amounts of clean public data, but the team inserted a fixed number of malicious files (100 to 500) into each one.

Next, the team tried to foil these attacks by changing how the bad files were organized or when they were introduced in the training. Then they repeated the attacks during each model's last training step, the fine-tuning phase.

What they found was that for an attack to be successful, size doesn't matter at all. As few as 250 malicious documents were enough to install a secret backdoor (a hidden trigger that makes the AI perform a harmful action) in every single model tested. This was even true on the largest models that had been trained on 20 times more clean data than the smallest ones. Adding huge amounts of clean data did not dilute the malware or stop an attack.

Build stronger defenses

Given that it doesn't take much for an attacker to compromise a model, the study authors are calling on the AI community and developers to take action sooner rather than later. They stress that the priorities should be making models safer, not just building them bigger.

"Our results suggest that injecting backdoors through data poisoning may be easier for large models than previously believed, as the number of poisons required does not scale up with model size—highlighting the need for more research on defenses to mitigate this risk in future models," commented the researchers in their paper.

Written for you by our author Paul Arnold, edited by Gaby Clark, and fact-checked and reviewed by Robert Egan—this article is the result of careful human work. We rely on readers like you to keep independent science journalism alive. If this reporting matters to you, please consider a donation (especially monthly). You'll get an ad-free account as a thank-you.

More information: Alexandra Souly et al, Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, arXiv (2025). DOI: 10.48550/arxiv.2510.07192

Journal information: arXiv

© 2025 Science X Network

Citation: Size doesn't matter: Just a small number of malicious files can corrupt LLMs of any size (2025, October 10) retrieved 10 October 2025 from https://techxplore.com/news/2025-10-size-doesnt-small-malicious-corrupt.html This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

How poisoned data can trick AI, and how to stop it

Feedback to editors

Share212Tweet133ShareShare27ShareSend

Related Posts

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore
AI

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore

October 10, 2025
0

October 10, 2025 The GIST OpenAI's newly launched Sora 2 makes AI's environmental impact impossible to ignore Sadie Harley scientific editor Andrew Zinin lead editor Editors' notes This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's...

Read moreDetails
AI, drone ships and new sensors could leave submarines with few places to hide

AI, drone ships and new sensors could leave submarines with few places to hide

October 10, 2025
Death of ‘sweet king’: AI chatbots linked to teen tragedy

Death of ‘sweet king’: AI chatbots linked to teen tragedy

October 10, 2025
Complex decisions still require human skills as AI supports public decision-making, says researcher

Complex decisions still require human skills as AI supports public decision-making, says researcher

October 9, 2025
The data center boom is here: Experts explain how to build AI infrastructure correctly

The data center boom is here: Experts explain how to build AI infrastructure correctly

October 9, 2025
AI-based patent abstract generator can discover and detail technology opportunities

AI-based patent abstract generator can discover and detail technology opportunities

October 9, 2025
AI shortens time it takes to measure the sustainability impact of a product

AI shortens time it takes to measure the sustainability impact of a product

October 9, 2025

Recent News

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore

OpenAI’s newly launched Sora 2 makes AI’s environmental impact impossible to ignore

October 10, 2025

Bitcoin, Ethereum Dive Alongside Stocks as Trump Threatens ‘Massive’ China Tariffs

October 10, 2025
The FCC is trying to make it easier for internet providers to charge hidden fees

The FCC is trying to make it easier for internet providers to charge hidden fees

October 10, 2025
Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

Size doesn’t matter: Just a small number of malicious files can corrupt LLMs of any size

October 10, 2025

TOP News

  • Tron Looks to go Public in the U.S., Form Strategy Like TRX Holding Firm: FT

    592 shares
    Share 237 Tweet 148
  • God help us, Donald Trump plans to sell a phone

    591 shares
    Share 236 Tweet 148
  • Investment Giant 21Shares Announces New Five Altcoins Including Avalanche (AVAX)!

    591 shares
    Share 236 Tweet 148
  • WhatsApp has ads now, but only in the Updates tab

    591 shares
    Share 236 Tweet 148
  • AI generates data to help embodied agents ground language to 3D world

    590 shares
    Share 236 Tweet 148
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms of Use
Advertising: digestmediaholding@gmail.com

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result
  • Crypto news
  • AI
  • Technologies

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved