cwsx 15 hours ago

I've been using `claude-4-sonnet` for the last few hours - haven't been able to test `opus` yet as it's still overloaded - but I have noticed a massive improvement so far.

I spent most of yesterday working on a tricky refactor (in a large codebase), rotating through `3.7/3.5/gemini/deepseek`, and barely making progress. I want to say I was running into context issues (even with very targeted prompts) but 3.7 loves a good rabbit-hole, so maybe it was that.

I also added a new "ticketing" system (via rules) to help its task-specific memory, which I didn't really get to test with 3.7 (before 4.0 came out), so I'm unsure how much of an impact this has.
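For anyone curious what I mean by a rules-based ticketing system: it's just instructions in the editor's rules file telling the model to keep a lightweight per-task "ticket" it can re-read later. A minimal sketch of the idea (the file names and wording here are hypothetical, not my exact setup):

```markdown
# rules file (hypothetical sketch)
- Before starting a task, create tickets/<task-name>.md containing the goal,
  acceptance criteria, and the files likely to change.
- After each meaningful change, append a one-line progress note to that ticket.
- At the start of a new conversation, read the relevant ticket first to
  restore task-specific context before touching any code.
```

The point is to give the model a durable, task-scoped memory that survives across chats, instead of relying on it re-deriving context every time.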

Using 4.0, the rest of this refactor (est. ~4 hrs with 3.7) took `sonnet-4.0` 45 minutes, including updating all of the documentation and tests (which with 3.7 normally requires multiple additional prompts, despite being outlined in my rules files).

The biggest differences I've noticed:

  - much more accurate/consistent; it actually finishes tasks rather than claiming it's done when nothing works

  - less likely to get stuck in a rabbit hole

  - no longer gets stuck when unable to fix something (cycling through the same 3 solutions over and over)

  - runs for MUCH longer without my intervention

  - when using 3.7:

     - had to prompt once every few minutes, 5-10 mins MAX if the task was straightforward enough

     - had to cancel the output in ~1 in 4 prompts as it'd get stuck in the same thought loops

     - needed to restore from a previous checkpoint every few chats/conversations

  - with 4.0:

    - I've had 4 hours of basically one-shotting everything

    - prompts run for 10 mins MIN, and the output actually works

    - it remembers to run tests, fix errors, update docs, etc.

Obviously this is purely anecdotal - and, considering the temperament of LLMs, maybe I've just been lucky and will be back to cursing at it tomorrow - but imo this is the best-feeling model since 3.5 released.