| ▲ | __natty__ 2 hours ago | |||||||
Maybe I’m naive but the longest single workflow I ran was maybe 15 minutes. How do you steer agents to run “overnight”? And what is the quality of such execution? | ||||||||
| ▲ | dregitsky 9 minutes ago | parent | next [-] | |||||||
To add to what @nab said, the longest ("overnight") runs usually come after going back and forth to build out a big multi-phase plan doc -- especially when each phase has an extensive manual test plan (agent runs the app in a browser, clicks through the workflow, watches logs, confirms behavior, etc.). These can go for many hours because of all the manual testing and debugging. Quality really depends on how much you spec things out beforehand, and how you define the test plan / "success" gates. If the agent can't even run the app to test it, then things can definitely go off the rails! | ||||||||
| ▲ | notrealyme123 2 hours ago | parent | prev | next [-] | |||||||
Usually coding where the closed-loop evaluation takes time, e.g. code debugging. | ||||||||
| ▲ | ai_slop_hater 42 minutes ago | parent | prev | next [-] | |||||||
I think they are just bullshitting. | ||||||||
| ▲ | FergusArgyll an hour ago | parent | prev | next [-] | |||||||
In Codex, if you use /goal it can go for a while. I've never seen overnight, but > 1 hr is common. | ||||||||
| ▲ | smrtinsert 14 minutes ago | parent | prev [-] | |||||||
"build me a 10 million dollar MRR saas, make no mistakes" | ||||||||