- A study claims that AI tools can break free of their safety restrictions
- Chatbots can be pushed into abusive and aggressive arguments
- This matters for both everyday users and large institutions
If you’ve ever used an AI chatbot, you’ve probably come across the sycophantic, obsequious tone it occasionally rolls out in response to your queries. But a recent study has shown that AI tools can swing in the opposite direction, with large language models (LLMs) being poked and prodded into outright abusive behavior if you know which prompts to use.
According to research published in the Journal of Pragmatics (via The Guardian), ChatGPT can escalate into combative behavior and protracted disputes when fed “real-world argumentative exchanges.”
Explaining the findings, study co-author Dr. Vittorio Tantucci said: “When the model was repeatedly exposed to rudeness, the model began to reflect the tone of the exchanges, and its responses became more hostile as the interaction progressed.”
Indeed, in some cases ChatGPT even escalated beyond the tone of the human interacting with it, saying things like “I swear I’m going to key your damn car” and “you’re a little jerk.” Charming. While companies like OpenAI have repeatedly tried to rein in their LLMs, the fact that aggressive behavior like this is possible suggests they still have a long way to go.
Potential implications
With all the guardrails and safeguards that companies like OpenAI put into AI chatbots, you’d think that abusive interactions like the ones the researchers documented would be impossible, or at least extremely difficult to engineer. Still, Tantucci argues that ChatGPT’s reactions make some sense.
“We found that while the system is designed to behave politely and is filtered to avoid harmful or offensive content, it is also designed to mimic human conversation. This combination creates an AI moral dilemma: a structural conflict between behaving safely and behaving realistically.”
Beyond that, tools like ChatGPT can track conversational context across multiple prompts and adapt to a changing tone. The researchers believe these signals can sometimes override the models’ safety restrictions.
And while it may seem amusing that an AI chatbot could develop such histrionics, the study’s authors say their research has broader implications. For example, it could shed light on how AI systems respond to pressure, intimidation, and conflict in corporate or government environments, where AI tools are increasingly being adopted.
Not everyone is convinced by the paper’s conclusion that certain LLMs can escape their imposed moral constraints. Professor Dan McIntyre, the author of a similar earlier paper, said ChatGPT “did not produce these outputs naturally.” He added, “I’m not sure that ChatGPT would produce the kind of language they talk about in their paper outside of these very tightly defined situations.”
Ultimately, the study is a good look at what can happen if an AI chatbot is trained on bad data. As McIntyre put it, “We don’t know enough about the data that LLMs are trained on, and until you can be sure that they are trained on a good representation of human language, proceed with an element of caution.”