Semantic watermarks for AI image recognition can be easily manipulated

June 23, 2025


Edited by Lisa Lock, scientific editor, and Robert Egan, associate editor.

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked, preprint, trusted source, proofread

Semantic watermark forgery. The attacker can transfer the watermark from a watermarked reference image requested by Alice (here: the diving cat) into any cover image (here: the moon landing). The obtained image will be detected as watermarked and attributed to Alice by the service provider, eroding the trust in watermark-based detection and attribution of AI-generated content. Credit: arXiv (2024). DOI: 10.48550/arxiv.2412.03283

Images generated by artificial intelligence (AI) are often almost indistinguishable from real images to the human eye. Watermarks—visible or invisible markers embedded in image files—may be the key to verifying whether an image was generated by AI. So-called semantic watermarks, which are embedded deep within the image generation process itself, are considered to be especially robust and hard to remove.

However, cybersecurity researchers from Ruhr University Bochum, Germany, have shown that this assumption is wrong. In a talk at the Conference on Computer Vision and Pattern Recognition (CVPR 2025) on June 15 in Nashville, Tennessee, the team revealed fundamental security flaws in these supposedly resilient watermarking techniques.

"We demonstrated that attackers could forge or entirely remove semantic watermarks using surprisingly simple methods," says Andreas Müller from Ruhr University Bochum's Faculty of Computer Science, who co-authored the study alongside Dr. Denis Lukovnikov, Jonas Thietke, Professor Asja Fischer, and Dr. Erwin Quiring. The paper is available on the arXiv preprint server.

Two novel attack strategies

Their research introduces two novel attack strategies. The first, known as the imprinting attack, operates on latent representations: the compressed internal encodings of an image on which AI image generators work. The latent representation of a real image is deliberately modified to resemble that of an image containing a watermark.

This makes it possible to transfer the watermark onto any real image, even though the watermarked reference image was purely AI-generated. An attacker can therefore deceive an AI provider by making any image appear watermarked, and thus appear artificially generated, effectively making real images look fake.
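The imprinting idea can be illustrated with a deliberately simplified toy model. In this hypothetical sketch (not the paper's actual method), a fixed random linear map stands in for the generator's image-to-latent encoder, and "watermarked" means the latent correlates strongly with a secret key vector. The attacker perturbs a real cover image with gradient descent until its latent matches a watermarked reference latent:

```python
import numpy as np

# Toy sketch of the imprinting attack (hypothetical, heavily simplified).
# In the real attack the encoder is a neural network and the perturbation
# is additionally constrained to stay visually imperceptible.

rng = np.random.default_rng(0)
D_PIX, D_LAT = 256, 64
W = rng.normal(size=(D_LAT, D_PIX)) / np.sqrt(D_PIX)   # stand-in encoder

def encode(img):
    return W @ img                                      # image -> latent

key = rng.normal(size=D_LAT)
key /= np.linalg.norm(key)                              # provider's secret key

def is_watermarked(img, thresh=0.5):
    z = encode(img)
    return float(key @ z) / np.linalg.norm(z) > thresh  # correlation test

z_ref = 5.0 * key               # latent of a watermarked reference image
cover = rng.normal(size=D_PIX)  # real photo, e.g. the moon landing

# Imprinting: gradient descent pulls encode(cover + delta) toward z_ref.
delta = np.zeros(D_PIX)
for _ in range(500):
    residual = encode(cover + delta) - z_ref
    delta -= 0.05 * (W.T @ residual)   # gradient of 0.5 * ||residual||**2

forged = cover + delta
print(is_watermarked(cover), is_watermarked(forged))
```

The original cover fails the watermark test while the imperceptibly perturbed copy passes it, which is exactly the forgery the researchers describe.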

"The second method, the reprompting attack, exploits the ability to return a watermarked image to the latent space and then regenerate it with a new prompt. This results in arbitrary newly generated images that carry the same watermark," explains co-author Dr. Quiring from Bochum's Faculty of Computer Science.

Attacks work independently of AI architecture

Alarmingly, both attacks require only a single reference image bearing the target watermark and can be executed across different model architectures: they work on legacy UNet-based systems as well as on newer diffusion transformers. This cross-model flexibility makes the vulnerabilities especially concerning.

According to the researchers, the implications are far-reaching: Currently, there are no effective defenses against these types of attacks. "This calls into question how we can securely label and authenticate AI-generated content moving forward," Müller warns. The researchers argue that the current approach to semantic watermarking must be fundamentally rethought to ensure long-term trust and resilience.

More information: Andreas Müller et al, Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models, arXiv (2024). DOI: 10.48550/arxiv.2412.03283

Journal information: arXiv

Provided by Ruhr University Bochum

Citation: Semantic watermarks for AI image recognition can be easily manipulated (2025, June 23), retrieved 23 June 2025 from https://techxplore.com/news/2025-06-semantic-watermarks-ai-image-recognition.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
