iamflimflam1 6 hours ago

The problem is, you cannot force the agent to do anything.

A suitably motivated AI will work around any instructions or controls you put in place.

yanosh_kunsh 4 hours ago

You are absolutely correct, but I don't need it to be 100% bulletproof.

I'm using opencode as a coding agent and I've added a custom plugin that implements a .aiexclude check (gist (https://gist.github.com/yanosh-k/09965770f37b3102c22bdf5c59a...)) before tool calls. No matter how good the checks are, a determined prompt can make the agent read a secret by the 5th or 6th attempt; but that only happens when reading secrets is the explicit goal. When I'm not specifically prompting it to extract secrets, the plugin reliably prevents the agent from reading them during normal coding work.

My threat model isn't a motivated attacker — it's accidental ingestion.

That's also why I think this should be a built-in feature of coding agents, though I understand the hesitation: if it can't guarantee 100% coverage, shipping it as a native safeguard risks giving users a false sense of security, which may be harder to manage than having no safeguard at all.
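The idea can be sketched in standalone Python. This is not the actual opencode plugin API or the gist's code, just an illustration of the mechanism: before a read-style tool call runs, the requested path is matched against gitignore-style patterns from a .aiexclude file (here using simple fnmatch semantics, which are an assumption; real gitignore matching has more rules).

```python
# Hypothetical .aiexclude pre-tool-call gate (illustrative, not opencode's API).
from fnmatch import fnmatch
from pathlib import Path, PurePosixPath

def load_patterns(root: str) -> list[str]:
    """Read non-empty, non-comment lines from <root>/.aiexclude."""
    f = Path(root) / ".aiexclude"
    if not f.exists():
        return []
    return [ln.strip() for ln in f.read_text().splitlines()
            if ln.strip() and not ln.startswith("#")]

def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the full path or any path component matches a pattern."""
    p = PurePosixPath(path)
    return any(
        fnmatch(str(p), pat) or any(fnmatch(part, pat) for part in p.parts)
        for pat in patterns
    )

def guarded_read(path: str, root: str = ".") -> str:
    """Run the check before the tool actually touches the file."""
    if is_excluded(path, load_patterns(root)):
        raise PermissionError(f"{path} is excluded by .aiexclude")
    return Path(root, path).read_text()
```

The key design point is that the check runs outside the model, in the tool-call path, so a forgetful agent never even sees the file contents.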

wdroz 3 hours ago

We could simply make the "view file" tool unable to see .env, and do the same for other grep-like tools.
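As a minimal sketch (the tool name and deny patterns are assumptions, not any agent's real implementation), this just means the deny list lives inside the tool itself rather than in a plugin:

```python
# Sketch: a file-viewing tool with a built-in deny list, so .env and
# similar files are invisible to the agent regardless of the prompt.
from fnmatch import fnmatch
from pathlib import Path

DENYLIST = (".env", ".env.*", "*.pem", "id_rsa")  # illustrative patterns

def view_file(path: str) -> str:
    name = Path(path).name
    if any(fnmatch(name, pat) for pat in DENYLIST):
        return "<file hidden from the agent>"  # never read from disk
    return Path(path).read_text()
```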

handfuloflight 5 hours ago

You can't force the agent itself, but you can control what it's able to push upstream via git.
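One way to do that kind of git-level enforcement is a pre-commit hook that rejects commits staging secret-looking files. This is a sketch under assumed patterns, not the commenter's setup; hooks can be any executable, Python included.

```python
# Sketch of a git pre-commit guard: the agent can edit files locally,
# but secret-looking files never make it into a commit, and therefore
# never make it upstream. (Patterns are illustrative.)
import re
import subprocess
import sys

BLOCKED = re.compile(r"(^|/)\.env($|\.)|\.pem$|(^|/)id_rsa$")

def staged_files() -> list[str]:
    """List paths staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True,
    )
    return [ln for ln in out.stdout.splitlines() if ln.strip()]

def main() -> int:
    bad = [f for f in staged_files() if BLOCKED.search(f)]
    if bad:
        print("Refusing to commit secret-looking files:", file=sys.stderr)
        for f in bad:
            print(f"  {f}", file=sys.stderr)
        return 1  # nonzero exit aborts the commit
    return 0

# To install: save as .git/hooks/pre-commit, make it executable,
# and end the file with sys.exit(main()).
```

Of course this only guards the push path; it does nothing about the agent reading secrets locally, which is the parent comment's point.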

jen729w 6 hours ago

It doesn’t even need to be motivated: just forgetful.