phamilton 4 hours ago
Long agent runs make such a difference. We focus a lot on new models and long context, but the bigger impact is in automatic verification. I've been leaning more on e2e test suites. They are slow, brittle, and inefficient, but that's almost a feature: I can step away, come back an hour later, and use that time to think about bigger problems.