| ▲ | phillipcarter 5 hours ago |
| Maybe it's because I spend a lot of time breaking up tasks beforehand to be highly specific and narrow, but I really don't run into issues like this at all. A trivial example: whenever CC suggests doing more than one thing in a planning mode, just have it focus on each task and subtask separately, bounding each one by a commit. Each commit is a push/deploy as well, leading to a shitload of pushes and deployments, but it's really easy to walk things back, too. |
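The commit-per-subtask flow described above can be sketched roughly like this (a minimal illustration, assuming plain git and a CI pipeline that deploys on every push; the file names and commit messages are hypothetical):

```shell
# Sketch: each narrow subtask gets its own commit, so any one step
# can be walked back cleanly without touching the others.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name Dev

# Subtask 1: one bounded change, one commit (a push/deploy would hang off this).
echo "feature A" > a.txt
git add a.txt
git commit -qm "subtask: add feature A"

# Subtask 2: another narrow change, another commit.
echo "feature B" > b.txt
git add b.txt
git commit -qm "subtask: add feature B"

# Walking back just subtask 2 is a single revert; subtask 1 is untouched.
git revert --no-edit HEAD
test ! -f b.txt
cat a.txt
```

Because every commit maps to exactly one subtask, `git revert` (or rolling the deploy back one commit) undoes a single unit of agent work instead of an entangled batch.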
|
| ▲ | toenail 4 hours ago | parent | next [-] |
| I thought everybody did this... Having a model create anything that isn't highly focused only leads to technical debt. I have used models to create complex software, but I do architecture and code reviews, and they are very necessary. |
| |
| ▲ | jkingsman 4 hours ago | parent | next [-] | | Absolutely. Effective LLM-driven development means you need to adopt the persona of an intern manager with a big corpus of dev experience. Your job is to enforce effective work-plan design, call out corner cases, proactively resolve ambiguity, demand written specs and call out when they're not followed, understand what is and is not within the agent's ability for a single turn (which is evolving fast!), etc. | |
| ▲ | bityard 4 hours ago | parent | prev | next [-] | | The use case that Anthropic pitches to its enterprise customers (my workplace is one) is that you pretty much tell CC what you want to do, then tell it generate a plan, then send it away to execute it. Legitimized vibe-coding, basically. Of course they do say that you should review/test everything the tool creates, but in most contexts, it's sort of added as an afterthought. | |
|
|
| ▲ | lelanthran 3 hours ago | parent | prev | next [-] |
| > Maybe it's because I spend a lot of time breaking up tasks beforehand to be highly specific and narrow, but I really don't run into issues like this at all. I'm looking at the ticket that was opened, and you can't really claim that someone who did such a methodical deep dive into the issue, presented a ton of supporting context to understand the problem, and patiently collected evidence for it... doesn't know how to prompt well. |
| |
| ▲ | aforwardslash an hour ago | parent | next [-] | | It's not about prompting; it's about planning and reviewing the plan before implementing. I sometimes spend days iterating on the specification alone, then creating an implementation roadmap, and finally iterating on the implementation plan before writing a single line of code. Just like any formal development pipeline. I started doing this a while ago (months) precisely because of issues like the ones described. On the other hand, analyzing prompts and deviations isn't that complex... just ask Claude :) | |
| ▲ | FergusArgyll an hour ago | parent | prev | next [-] | | The methodical guy confused visible reasoning traces in the UI with reasoning tokens, and used Claude to hallucinate a report. | |
| ▲ | phillipcarter 3 hours ago | parent | prev [-] | | Sure I can. |
|
|
| ▲ | itmitica 4 hours ago | parent | prev | next [-] |
| I've noticed a regression in review quality. You can break up the task all you want; when it's crunch time, it takes a page from Gemini's book, silently stops trying, and gets all sycophantic. |
|
| ▲ | jonnycoder 4 hours ago | parent | prev [-] |
| I do the same but I often find that the subtasks are done in a very lazy way. |