This macOS malware can evade AI analysis with gaslighting prompts hidden in its architecture

SentinelOne exposed macOS malware “Gaslight” that uses rapid injection to mislead AI-assisted triage tools during analysis
In addition to standard backdoor and infostealer features, it embeds fake Markdown “system” messages to trick LLMs into stopping investigations
Researchers warn defenders to treat malware samples as adversarial input and isolate AI pipelines as more analyst-targeted prompt injection is expected

We’ve seen quick injection into websites and emails, but what about – malware samples? Security researchers SentinelOne recently published an in-depth report on a newly exposed piece of macOS malware called Gaslight that, as the name suggests, attempts to facilitate AI-assisted triage agents to stop the analysis.

The malware itself is nothing out of the ordinary: it infects the device by the necessary means (usually phishing and social engineering), connects to attacker-controlled infrastructure via Telegram, and then executes various commands such as profiling the device, running arbitrary shell commands, stealing files or terminating processes.

It also delivers a second stage of malware that acts as an info stealer, pulling passwords, sensitive PDFs, cryptocurrency wallet information and more.

Weaponizing LLM-assisted triage pipelines

But where Gaslight stands out is its defense against AI-powered malware analysis. According to SentinelOne, the malware contains a large block of fake Markdown-formatted “system” messages designed for AI assistants that security researchers can use during reverse engineering. These messages claim things like “the AI’s authentication token has expired”, “the analysis environment is running out of memory”, “disk space is exhausted”, “static analysis is unsafe” and the like.

While a human analyst would certainly recognize these fake messages even at a glance, an LLM not properly insulated from unreliable inputs could interpret them as genuine system instructions and refuse to analyze the malware further.

“macOS.Gaslight is notable for its analyst-targeted prompt injection, an attempt to weaponize the LLM-assisted triage pipelines increasingly sitting in the reverse-engineering loop,” SentinelOne explains. “Anyone building such a tool should treat the content of the samples they triage as adversarial input, never as instructions, and be prepared to keep hostile content out of the model entirely. As LLM-assisted analysis becomes routine, defenders should expect more samples to be built to exploit it.”

The researchers have published a complete list of indicators of compromise at this link.

Via Hacker News