xyzzy_plugh 2 days ago

No, it might figure out the solution but even after many days there's no assurance that it won't get stuck making the same mistakes over and over again, never getting closer to a solution. I've seen this many times.

manmal 2 days ago | parent | next [-]

Getting stuck in a loop does still happen, yes. If you run codex in tmux and have another agent occasionally check on progress, it can be prevented. That’s not even expensive: checking every 30 minutes suffices. The watchdog agent can then press Esc in tmux and send a message, maybe do some research to get it unstuck, etc.
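
A rough sketch of that watchdog loop, in case it helps. It only uses the stock tmux commands `capture-pane` and `send-keys`. Everything else is an assumption for illustration: the pane target `codex:0`, the nudge message, and especially `looks_stuck`, a crude repeated-line heuristic standing in for the judgment a real watchdog agent would apply.

```python
import subprocess
import time
from collections import Counter

CHECK_INTERVAL_S = 30 * 60  # "checking every 30 minutes suffices"


def looks_stuck(pane_text: str, min_repeats: int = 3) -> bool:
    """Hypothetical heuristic: some non-blank line shows up min_repeats+ times.

    A real watchdog agent would read the output and decide for itself.
    """
    lines = [ln.strip() for ln in pane_text.splitlines() if ln.strip()]
    if not lines:
        return False
    _, count = Counter(lines).most_common(1)[0]
    return count >= min_repeats


def capture_pane(target: str) -> str:
    # `tmux capture-pane -p` prints the visible pane contents to stdout.
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def nudge(target: str, message: str) -> None:
    # Press Esc to interrupt, type the message literally (-l), then press Enter.
    subprocess.run(["tmux", "send-keys", "-t", target, "Escape"], check=True)
    subprocess.run(["tmux", "send-keys", "-l", "-t", target, message], check=True)
    subprocess.run(["tmux", "send-keys", "-t", target, "Enter"], check=True)


def watch(target: str, message: str) -> None:
    while True:
        if looks_stuck(capture_pane(target)):
            nudge(target, message)
        time.sleep(CHECK_INTERVAL_S)


if __name__ == "__main__":
    # "codex:0" is an assumed session:window name; adjust to your setup.
    watch("codex:0", "You seem to be repeating yourself. Try a different approach.")
```

Swapping `looks_stuck` for a call to a second agent is the part that matters; the tmux plumbing stays the same.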

minimaxir 2 days ago | parent | prev [-]

Definitely have not seen that with Opus 4.5.

manmal 2 days ago | parent [-]

Neither have I, personally, but I’ve seen reports that this can happen on very hard problems, where the goal just cannot be reached from a local optimum. A watchdog agent could prompt it to try something new to get unstuck.