jmalicki 3 hours ago
Most of the complaints I've seen are along the lines of "I asked it for code and it didn't compile."

The real magic of LLMs comes when they iterate until the code compiles and the tests pass, and you don't even bother looking at the output until then. Each step is pretty stupid, but the ability to doggedly keep at it, very quickly, until it succeeds quite often produces great work.

If you don't have linters checking for valid syntax and approved coding style, if you don't have tests to ensure the LLM doesn't screw up the code, if you don't have good CI, you're going to have a bad time.

LLMs are just like extremely bright but sloppy junior devs. If you put the same guardrails in place for your project that you would for a junior dev, things tend to work very well: you're giving the LLM a chance to check its work and self-correct.

It's the agentic loop that makes it work, not the single-shot output of an LLM.
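To make the loop described above concrete, here is a minimal sketch of it, not any particular tool's implementation. Everything in it is an assumption for illustration: `agentic_loop`, `syntax_check`, `behavior_check`, and `scripted_model` are hypothetical names, the "model" is a scripted stand-in that replays canned attempts instead of calling a real LLM, and Python's built-in `compile` stands in for a compiler or linter.

```python
from typing import Callable, List, Optional

# A check takes candidate source code and returns an error message,
# or None if the candidate passes. (Hypothetical interface.)
Check = Callable[[str], Optional[str]]


def syntax_check(source: str) -> Optional[str]:
    """Guardrail 1: does the code parse? Stands in for a compiler/linter."""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return f"syntax error: {exc.msg} (line {exc.lineno})"
    return None


def behavior_check(source: str) -> Optional[str]:
    """Guardrail 2: a tiny test suite. Expects the candidate to define add(a, b).

    A real setup would run tests in a sandbox; exec() on untrusted
    output is only acceptable in a toy example like this one.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        if namespace["add"](2, 3) != 5:
            return "test failed: add(2, 3) != 5"
    except Exception as exc:
        return f"test crashed: {exc!r}"
    return None


def agentic_loop(
    propose: Callable[[Optional[str]], str],
    checks: List[Check],
    max_steps: int = 5,
) -> Optional[str]:
    """Ask for a candidate, run the checks, feed the first failure back, repeat."""
    feedback: Optional[str] = None
    for _ in range(max_steps):
        candidate = propose(feedback)
        feedback = None
        for check in checks:
            feedback = check(candidate)
            if feedback is not None:
                break  # stop at the first failing guardrail
        if feedback is None:
            return candidate  # every check passed
    return None  # gave up after max_steps


def scripted_model(attempts: List[str]) -> Callable[[Optional[str]], str]:
    """Stand-in for an LLM: replays canned attempts and ignores the feedback."""
    remaining = iter(attempts)

    def propose(feedback: Optional[str]) -> str:
        return next(remaining)

    return propose


attempts = [
    "def add(a, b) return a + b",          # does not parse
    "def add(a, b):\n    return a - b",    # parses, fails the test
    "def add(a, b):\n    return a + b",    # passes both checks
]
result = agentic_loop(scripted_model(attempts), [syntax_check, behavior_check])
```

Here the first two attempts are rejected by the guardrails and only the third is returned, so nobody has to look at the broken intermediate steps. Note that the checks only catch what they encode; a candidate that passes them can still be wrong in ways no check covers.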
alecbz 2 hours ago
Stuff like this works for things that can be verified programmatically (though I find LLMs still occasionally ignore instructions like this), but ensuring correct functionality and sensible code organization are bigger challenges.

There are techniques that can help with this, but none of them work perfectly, and most of the time some direct oversight from me is required. That really clips the potential productivity gains, because to provide oversight effectively you need to page in all the context of what's going on and how it ought to work, which is most of what the LLMs are, in theory, helping you with.

LLMs are still very useful for certain tasks (bootstrapping in unfamiliar domains, tedious plumbing or test fixture code), but the massive productivity gains people are claiming or alluding to still feel out of reach.