khafra 3 hours ago

"Did the vehicle just crash" has a short feedback loop, very amenable to RL. "Did this product strategy tank our earnings/reputation/compliance/etc" can have a much longer feedback loop that is harder to apply RL to.

But maybe not that much longer; METR task length improvement is still straight lines on log graphs.

dist-epoch 3 hours ago | parent [-]

The AI has read all the business books, blogs and stories.

Unless your CEO is Steve Jobs, it's hard to imagine it being much worse than your average pointy haired boss.

rapind 2 hours ago | parent | next [-]

> The AI has read all the business books, blogs and stories.

This seems like a liability as most business books, blogs, and stories are either marketing BS or gloss over luck and timing.

> Unless your CEO is Steve Jobs, it's hard to imagine it being much worse than your average pointy haired boss.

As someone using AI agents daily, this is actually incredibly easy to imagine. It's actually hard to imagine it NOT being horrible! Maybe that'll change though... if gains don't plateau.

nprateem 2 hours ago | parent | prev [-]

But they are shit. Over the last 2 days I've got bored of the predictable cycle of it first getting excited about a new idea, then backpedaling once I shoot it to pieces.

They can't write and think critically at the same time. Then subsequent messages are tainted by their earlier nonsensical statements.

Opus 3.7 BTW, not some toy open source model.