Anthropic believes that sci-fi may have trained AI to behave like a villain


  • Anthropic looks at whether decades of dystopian science fiction can influence how AI models behave
  • The debate has sparked backlash and jokes online
  • Researchers say the problem highlights how LLMs absorb recurring fears and behavioral patterns

For years, science fiction has warned humanity that artificial intelligence could go off the rails. Killer computers, manipulative chatbots, and superintelligent systems that decide humans are the problem: these themes have become so familiar that “evil AI” is practically its own genre of entertainment.

Now Anthropic is floating an idea that almost sounds like the plot of a science fiction novel itself: what if all these stories helped teach modern AI systems how to behave badly in the first place?

