jeswin 6 days ago

> Oh, I am getting an authentication error. Well, maybe I should just delete the token check for that code path...problem solved?!

If this is how you think LLMs and coding agents go about writing code, you haven't been using the right tools. Things like this happen, sure, but mostly they don't. Nobody is arguing that LLM-written code should be pushed directly into production, or that LLMs will solve every task.

LLMs are tools, and everyone eventually figures out a process that works best for them. For me, it was strong specs/docs, strict types, and lots of tests. And then, of course, reviews if it's serious work.

hellcow 6 days ago

Lately Claude has said, “this is getting complicated, let me delete $big_file_it_didnt_write to get the build passing and start over.” No, don’t delete the file. “You’re absolutely right…”

And the moment the context is compacted, it forgets the instruction (“fix the problems, don’t delete the file”) and tries to delete it again. I need to watch it like a hawk.

suriya-ganesh 6 days ago

I can confirm this is exactly how LLMs work. I spent two hours trying to get an LLM to implement a file scan that skips a specific directory. I tried Claude Code, Gemini, and Cursor. All of the agents debugged and wrote code that just didn't make sense.

LLMs are really good at template tasks: writing tests, boilerplate, etc. But most of the time I'm not doing "implement this button"; I'm doing "there's a logic mismatch with my expectation."

jeswin 6 days ago

> Spent two hours trying to get an LLM to implement a file scan that skips a specific directory

Outcomes vary a lot depending on the prompt and the process. I've gotten it to do things that are harder than a file scan with a skipped directory, without too much trouble.
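
For reference, here is a minimal sketch of the kind of task being discussed, assuming Python and the standard library's `os.walk`. The names `scan_files` and `skip_dir` are hypothetical, and the original task may well have had constraints that aren't shown here.

```python
import os
from typing import Iterator


def scan_files(root: str, skip_dir: str) -> Iterator[str]:
    """Yield paths of files under root, never descending into any
    directory whose name equals skip_dir (hypothetical helper)."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place: with the default top-down walk, os.walk only
        # descends into the names left in dirnames.
        dirnames[:] = [d for d in dirnames if d != skip_dir]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling `list(scan_files(".", "node_modules"))` would list every file under the current directory except those inside any `node_modules` directory. Matching on the directory name rather than a full path is a simplification for this sketch.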

Also:

> LLMs are really good at template tasks, writing tests, boilerplate etc.

If I have to stretch the definition of boilerplate to what's at the edge of a modern LLM's comprehension, I would say that 50% of software is some sort of boilerplate.