AI chatbot teaches AI ‘student’ to love owls, even after data is scrubbed

Large language models (LLMs) can pass unwanted traits on to other models, and those traits can persist even when the training data has been scrubbed of references to the original trait, according to new research published in Nature. In one example, a model appears to transmit a preference for owls to other models via hidden signals in the data. The findings suggest that more thorough safety checks are needed when developing LLMs.

By cryptoadmin
