brulard | 3 days ago
> These issues are inherent to the technology

That's simply false. Even if LLMs don't produce correct, valid code on the first shot 100% of the time, with an agent it's simply a matter of iteration. I have Claude Code connected to Playwright and to context7 for docs, so it can iterate by itself on syntax errors, runtime errors, or problems with the data on the backend side. Currently I see near zero cases where it fails to produce valid, working code. If it's incorrect in some aspect, it's not hard to steer it to a better solution or to fix it yourself.

And even if it failed at most of the stages of the plan, it's not all wasted time. I brainstormed ideas, formed the requirements and feature specifications, and have clear documentation, an implementation plan, unit tests, etc., which I can use to code it myself. So even in the worst-case scenario my development workflow is improved.
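For reference, the wiring is done through MCP servers in the project's .mcp.json. A minimal sketch of what that looks like (the package names, @playwright/mcp and @upstash/context7-mcp, are the ones I believe are current, but check their docs before copying):

    {
      "mcpServers": {
        "playwright": {
          "command": "npx",
          "args": ["@playwright/mcp@latest"]
        },
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        }
      }
    }

With that in place, Claude Code can drive a real browser to verify its own changes and pull current library docs instead of relying on what's in its training data.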

mathiaspoint | 3 days ago | parent
It definitely isn't. LLMs often end up stuck in weird corners they just don't get, and they need someone familiar with the theory of what they're working on to unstick them. If the agent is the same model as the code generator, it won't be able to do that on its own.

nojs | 3 days ago | parent
Could you explain your exact Playwright setup in more detail? I've found that Claude really struggles to end-to-end test complex features that require browser use; it gets stuck for several minutes trying to find the right button to click, for example.