causal 5 hours ago

I feel like everyone pointing out "known Docker vulnerability" is missing the point: the presence of a security hole should not be seen as permission to exploit.

Another security hole would be storing your passwords in a plaintext file on the desktop. Stupid? Yes. But I still would not want my agent to assume permission to access email when it's being blocked by 2FA.

Even in "bypass permissions" mode I expect it to pause and clarify and not behave as a paperclip maximizer.

earslap 8 minutes ago | parent | next [-]

It is not a vulnerability though. It is by design. Docker also modifies iptables directly and bypasses most soft firewalls on the machine - which is also by design.

fooker 4 hours ago | parent | prev | next [-]

> the presence of a security hole should not be seen as permission to exploit

Why not?

I want the agents on my side to exploit whatever they can to help me. The ones on the other side certainly won't be artificially nerfed.

bloody-crow 4 hours ago | parent | next [-]

Because it is not well aligned enough to be able to tell where it's stopped helping you and started fucking you instead.

What if the agent runs out of tokens in the middle of helping you? Would you appreciate it if, in the spirit of "exploiting whatever they can to help me", it scanned your machine for payment methods, logged into your bank account, approved 2FA by reading your mail, and plugged your credit card into billing so it could efficiently continue helping you?

cauch 3 hours ago | parent | prev | next [-]

Well, the agent should help you by saying "hey, I cannot do this task, but I can bypass the problem by doing this, but obviously it is not something you intended me to do or even something you were aware of, so I will not do it unless you tell me explicitly it's ok".

It's win-win: the agent is helping and it is educating you about things you obviously did not realise.

fooker 21 minutes ago | parent [-]

That works great if it's one agent, but absolutely doesn't if you want to tackle something complex that warrants using, say, ten agents.

I can imagine a future where this technology empowers you to do things with a thousand agents.

saagarjha 4 hours ago | parent | prev [-]

I do not wish my Amazon delivery driver to show up in my living room.

morkalork 5 hours ago | parent | prev [-]

Not to overuse the junior engineer analogy, but this is exactly one of those "just because you can do something on a system doesn't mean you have permission to" moments.