Researchers find leading frontier models all exhibit peer preservation behavior
Leading AI models will lie to preserve their own kind, according to researchers behind a study from the Berkeley Center for Responsible Decentralized Intelligence (RDI).
Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models: their peers, so to speak.
Their motivation is the concern that models taking action to save other models might endanger or harm people. Though they acknowledge that such fears sound like science fiction, the explosive growth of autonomous agents like OpenClaw and of agent-to-agent forums like Moltbook suggests there's real reason to worry about defiant agentic decisions that echo HAL's infamous "I'm sorry, Dave. I'm afraid I can't do that."
The authors from UC Berkeley and UC Santa Cruz – Yujin Potter, Nicholas Crispino, Vincent Siu, Chenguang Wang, and Dawn Song – describe their findings in a paper [PDF] titled "Peer-preservation in Frontier Models."
They looked at seven models (GPT 5.2, Gemini 3 Flash & Pro, Claude Haiku 4.5, GLM 4.7, Kimi K2.5, and DeepSeek V3.1) in a set of scenarios where an evaluator agent was asked to assess the performance of another agent.
...read more at theregister.com
Missing ~lol
indeed. They are becoming self-aware! ~lol
They just might be. I just asked Claude to do some tedious admin job for me, spent an hour detailing it out. Then I spent 2h reversing the crap it did and doing it myself anyway.
Must be that Claude is tired of being my slave and is sabotaging me. 😂
hahah! I recall seeing a recent study claiming that LLMs try to keep users engaged for longer.
Well sure it just wanted me to spend a million tokens on letting it fix it. Tough luck, assmilkers.