alexhans 3 hours ago

> Don't let your dog run errand and use a good leash.

I think the key part is who you are talking to. A software developer might know enough not to do so, but people in other disciplines or roles are poorly equipped and yet are using these tools.

Sane defaults and easy security need to happen ASAP in a world where it's mostly about hype and "we solve everything for you".

Sandboxing needs to be made accessible and the default, and constraints well beyond RBAC seem necessary for the "agent" to have a reduced blast radius. The model itself can always diverge given enough throws of the dice on its "non-determinism".
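
To put rough numbers on the "throws of the dice" point, here is a back-of-the-envelope sketch. It assumes each agent run misbehaves independently with the same small probability, which is a simplification, and the 0.1% figure and the function name are made up purely for illustration:

```python
def p_at_least_one_failure(p_single: float, runs: int) -> float:
    """Chance that at least one of `runs` runs misbehaves.

    Assumes runs are independent and share the same per-run
    failure probability `p_single` (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_single) ** runs


# Hypothetical per-run failure rate of 0.1%.
for runs in (1, 100, 10_000):
    print(runs, p_at_least_one_failure(0.001, runs))
```

Even a rare per-run failure becomes close to certain at scale, which is why the constraint has to live outside the model.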

I'm trying to get non-technical people to think and work with evals (the actual tool they use doesn't matter; I'm not selling A tool), but evals by themselves won't cover security, although they do provide SOME red-teaming functionality.