brotchie 3 hours ago

The way I solved this was that my openclaw doesn't interact directly with any of my personal data (calendar, Gmail, etc.).

I essentially have a separate process that syncs my Gmail, with the email body contents encrypted using a key my openclaw doesn't have trivial access to. I then have another process that reads each email from a SQLite db and runs Gemini 2 Flash Lite against it, with an anti-prompt-injection prompt plus structured data extraction (JSON in a specific format).

My claw can only read the sanitized structured data extraction (which is pretty verbose and can contain passages from the original email).
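A minimal sketch of that flow, under assumptions: the table names, the JSON keys, and the `extract` callable (standing in for the Flash Lite call) are all made up for illustration, and the encryption of the raw bodies is left out. The idea it shows is that the claw only ever reads rows that passed a strict shape check.

```python
import json
import sqlite3
from typing import Callable

# Hypothetical schema for the structured extraction; not the real format.
ALLOWED_KEYS = {"sender", "subject", "summary", "action_items"}


def make_db() -> sqlite3.Connection:
    """Create an in-memory db with a raw table and a sanitized table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_email (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE sanitized_email (id INTEGER PRIMARY KEY, data TEXT)")
    return db


def validate_extraction(raw: str) -> dict:
    """Parse extractor output and reject anything outside the expected shape."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != ALLOWED_KEYS:
        raise ValueError("unexpected keys in extraction")
    for key in ("sender", "subject", "summary"):
        if not isinstance(data[key], str):
            raise ValueError(f"{key} must be a string")
    items = data["action_items"]
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        raise ValueError("action_items must be a list of strings")
    return data


def sanitize_all(db: sqlite3.Connection, extract: Callable[[str], str]) -> int:
    """Run the extractor over every raw email; store only validated output."""
    stored = 0
    rows = db.execute("SELECT id, body FROM raw_email").fetchall()
    for email_id, body in rows:
        try:
            data = validate_extraction(extract(body))
        except ValueError:  # also covers json.JSONDecodeError
            continue  # drop anything malformed instead of passing it on
        db.execute(
            "INSERT OR REPLACE INTO sanitized_email (id, data) VALUES (?, ?)",
            (email_id, json.dumps(data)),
        )
        stored += 1
    db.commit()
    return stored


def read_for_agent(db: sqlite3.Connection) -> list:
    """The only read path the agent gets: sanitized rows, never raw bodies."""
    rows = db.execute("SELECT data FROM sanitized_email ORDER BY id")
    return [json.loads(row[0]) for row in rows]
```

Note that the shape check only constrains structure. A string field like `summary` can still carry text copied from the original email, so it doesn't by itself stop an injection that survives the extraction step.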

The primary attack vector is an attacker crafting an "inception" prompt injection, where they get a prompt injection through the Flash Lite sanitization and into the JSON output in such a way that it also prompt injects my claw.

Still a non-zero risk, but mostly mitigates naive prompt injection attacks.

jakeydus an hour ago

That doesn’t sound like you solved it; that sounds like you obfuscated it. Feels a bit to me like you’ve got a wall around a property and people are using ladders to get in, so you built another wall around the first wall.

I recognize I’m being pedantic but two layers of the same kind of security (an LLM recognizing a prompt injection attempt) are not the same as solving a security vulnerability.
