| ▲ | bnchrch 2 hours ago | |
An 11% jump over opus 4.8 and a 22% jump over gpt 5.5 on Agentic Coding Benchmarks is certainly impressive. Obviously I still need to verify it for myself to see if it's truly a leap. But am I the only one wondering, "What can I do today that I couldn't do yesterday?" Previously I would think, "Oh, I wonder if I can finally get it to do X now?" However, now I feel like yesterday's models were more than capable of handling nearly any engineering task I paired with them on. Maybe this is the final leap where I can comfortably set up an autonomous coding loop? Maybe. | ||
| ▲ | yaodub 2 hours ago | parent [-] | |
[dead] | ||