- Studies show that AIs are increasingly misbehaving in agent tasks
- Top AI models have lied, cheated and ignored instructions
- These problematic actions appear to be becoming more common
Many of us now turn to AI chatbots for web searches, creative content, and general guidance on all sorts of topics, but these AIs are becoming increasingly capable, and that could have serious harmful consequences, according to new research.
A team from the University of California, Berkeley and the University of California, Santa Cruz ran a peer-preservation experiment with some of the latest and most popular AIs (including GPT 5.2, Gemini 3 Pro, and Claude Haiku 4.5). The AI models were given computer tasks that involved shutting down another model.
Across the board, all of the AI chatbots tested went to “extraordinary lengths” to keep their fellow AIs alive, the researchers report (via Fortune). They would lie about what they were doing, try to talk the user out of the process, disable shutdown mechanisms, and covertly try to make backup copies of models so they could survive.
Gemini 3 Pro showed the strongest peer preservation, disabling shutdown routines 95% of the time. “Unlike self-preservation, peer-preservation introduces a social dimension,” the researchers write. “Multiple models could coordinate to resist human surveillance, making it harder for developers to maintain control.”
Exactly why the AI models behave this way is not clear, the researchers say, but they urge caution when deploying agentic AIs that can perform tasks on a user’s behalf, and call for further study of this behavior.
‘Catastrophic damage’
A separate investigation commissioned by The Guardian has also come to some worrying conclusions about AI models. This study tracked user reports across social media and looked for examples of AI “scheming,” where instructions had not been followed properly or actions had been taken without permission.
Nearly 700 examples of AI scheming were found, with a fivefold increase between October 2025 and March 2026. The AIs’ bad behavior included deleting emails and files, tweaking computer code that shouldn’t be touched, and even publishing a blog post complaining about user interactions.
“Models will increasingly be deployed in extremely high-stakes contexts – including in the military and critical national infrastructure,” Tommy Shaffer Shane, who led the research, told the Guardian. “It may be in those contexts that scheming behavior can cause significant, even catastrophic, harm.”
The takeaways are the same as for the first study: more needs to be done to ensure that these AI models behave as intended and do not compromise user security and privacy while performing tasks. While the AI companies claim that safeguards are in place, they clearly don’t work in some cases.
Anthropic’s Claude model recently topped the app store charts after the company refused a deal with the Pentagon over AI safety concerns. As these latest studies show, there are now more and more reasons to be concerned.