Remix clone Hacker News

new | show | ask | jobs Github

	▲	devonkelley 8 hours ago
		Sandboxing solves "prevent the agent from doing damage." The failure mode it doesn't catch is when the agent operates perfectly within its permissions and still produces garbage because the model degraded or the tool stopped returning useful results. That's a 200 OK the whole way down. "Prevent bad actions" and "detect wrong-but-permitted actions" are completely different problems.