| ▲ | tovej 7 hours ago |
| An LLM has never saved me time. It has always produced something that doesn't quite work: it has the rough shape of what I want, but somehow always gets all the details wrong. I can type up what I want much faster and be sure it's at least solving the right problem, even if it may have bugs. There are also tools for generating boilerplate that work much, much better than LLMs. And they're deterministic. |
|
| ▲ | bendmorris 31 minutes ago | parent | next [-] |
| You're going to get a lot of "skill issue" comments, but your experience basically matches mine. I've only found LLMs to be useful for quick demos where I explicitly didn't care about the quality of the implementation. For my core responsibilities the output has never met my quality bar, and getting it there has not saved me time. What I'm learning is that different people and domains have very different standards for that. |
|
| ▲ | dntrshnthngjxct 6 hours ago | parent | prev | next [-] |
| If you do not plan out the architecture soundly, no amount of prompting will fix it. I know this because my "handmade" project, built for backward compatibility and with horrible architecture, keeps getting badly patched by the LLM, while the projects that rely on upfront planning of the features and architecture end up working right. |
| |
| ▲ | dncornholio 5 hours ago | parent | next [-] | | LLMs keep messing up even on a plain Laravel codebase. | |
| ▲ | mikkupikku 5 hours ago | parent | prev [-] | | I think that's true, but something even more subtle is going on. The quality of the LLM output depends on how it was prompted in a way more profound than I think most people realize. If you prompt the LLM using jargon and lingo that indicate you are already well experienced in the domain, the LLM will role-play an experienced developer. If you prompt it like you're a clueless PHB who's never coded, the LLM will output shitty code to match the style of your prompt. This extends to architecture: if your prompts are written with a mature understanding of the architecture that should be used, the LLM will follow suit; if not, the LLM will just slap together something that looks like it might work but isn't well thought out. | | |
| ▲ | simonask 3 hours ago | parent [-] | | This is magical thinking. LLMs are physically incapable of generating something “well thought out”, because they are physically incapable of thinking. | | |
|
|
|
| ▲ | vntok 7 hours ago | parent | prev [-] |
> An LLM has never saved me time. It has always produced something that doesn't quite work, has the rough shape of what I want, but somehow always gets all the details wrong. This reads like a skill issue on your end, at least in part on the prompting side. It does take time to reach a point where you can prompt an LLM well enough to get a correct answer in one shot, developing an intuitive understanding of what absolutely needs to be written out and what can be inferred by the model. |
| |
| ▲ | Jooror 6 hours ago | parent [-] | | I’m curious how you landed on “git gud; prompt better” and not “maybe the domain I work in is a better fit for LLM code”. Or, to be a bit less generous, consider the possibility that the code you’re generating is boilerplate, marshaling, and/or API calls: a facade of perceived complexity over something that’s as complex as a filter-map or two. | | |
| ▲ | 3371 6 hours ago | parent | next [-] | | Sharing my 2 cents. For the past 2 months I've been using all the SOTA models to help me design a new DSL for narrative scripting (such as game storytelling) and a C# runtime implementation of the script-player engine. The language spec and design are about 95% authored by me up to this point; I have the LLMs work on the 2nd layer (the implementation specs/guidelines) and the 3rd layer (the concrete C# implementation). Since it's a new language, I consider it a somewhat novel task for LLMs (at least, not boilerplate stuff like an HTTP API or a CRUD service). I'd say these LLMs have been very helpful. You can tell they sometimes get confused and have trouble complying with the unfamiliar language spec and design, but they are mostly smart enough to carry out the objectives, and they get better and better once the project is on track and has plenty of files/resources to read and reference. I'd also say "prompt better" is an important factor, just much more nuanced/complicated. I started with zero experience with LLM agents and have learned a lot about how to tame them, and I developed a protocol for collaborating with agents. All of this came from countless trials and errors, but in the end it boils down to "prompt better". | | |
| ▲ | Jooror 2 hours ago | parent | next [-] | | I wonder if my intuition here is correct; I would posit that “PL implementation” is a far more popular and well-explored field than it seems. How many toy/small/labor-of-love langs make it to Show HN? How many more simply don’t? I’ve never personally caught the language implementation bug. I appreciate your perspective here. | | |
| ▲ | 3371 2 hours ago | parent [-] | | I totally agree, and I was fully aware of how commonly people make languages for fun when I replied. But I feel the rationale still stands: given LLMs' nature, common boilerplate tasks are easy because the model can more or less "decompress" them from training data. But for a new language design, unless the language is almost identical to some other language captured by the model, that "decompression" would simply fail. |
| |
| ▲ | tovej 4 minutes ago | parent | prev [-] | | I am prompting better. It doesn't help the LLM be more productive than me on a regular Tuesday. Sure, I can get the task done by delegating everything to an agentic workflow, but that just adds a bunch of useless overhead to my work. I still need to know what the code does at the end of the day, so I can document it and reason about it. If I write the code myself, that's easy. If an LLM writes it, it's a chore. And even without those concerns, the LLM is still slower than me, unless it's trivial boilerplate, in which case other tools serve me better and cheaper. I'll note that a compiler is one of the best understood and most widely implemented kinds of software project, much of it open source, which means the LLM has a lot of prior art it can copy. |
| |
| ▲ | rybosworld 5 hours ago | parent | prev | next [-] | | When web search first arrived, the same thing happened. That is, some people didn't like using the tool because it wasn't finding what they wanted. This is still true for a lot of folks today, actually. It's less "git gud; prompt better", and more, "be able to explain (well) what you want as the output". If someone messages the IT guy and says "hey my computer is broken" - what sort of helpful information can the IT guy offer beyond "turn it on and off again"? | |
| ▲ | mikkupikku 5 hours ago | parent | prev | next [-] | | > I’m curious about how you landed “git gud; prompt better” and not “maybe the domain I work in is a better fit for LLM code”. 1. Personal experience: lazy prompting vs. careful prompting. 2. They're coincidentally good at things I'm good at, and shit at things I don't understand. 3. Following from 2, when used by somebody who does understand a problem space which I do not, they easily succeed. That dog vibe-coding games succeeded in getting Claude to write games because his master knew a thing or two about it. I, on the other hand, have no game dev experience, and almost no hobby experience with games specifically, so I struggle to get any game code that even remotely works. | | |
| ▲ | Jooror 2 hours ago | parent [-] | | Irrespective of the domain you specifically listed in 3 (game dev is, believe it or not, one of the “more complex” domains), you have completely missed the point. > 2. They're coincidentally good at things I'm good at, and shit at things I don't understand. This may well be true! In a perfect world this would be balanced with the knowledge that maybe “the things you’re good at” are objectively* easier than “the things you don’t understand”. Speaking for myself, I’m proficient in many more easy things than hard things. *inasmuch as anything can be “objectively” easier | | |
| |
| ▲ | vntok 5 hours ago | parent | prev [-] | | The parent is specifically talking about producing boilerplate code (a domain LLMs excel at) and not having had any success with it. It's therefore not a leap of logic to assume they haven't put (enough) effort into getting better at prompting first, which is perfectly fine per se, but it points to a skill issue rather than an immutable property of gen AI. The uncomfortable fact remains that one cannot really expect to get much better results from an LLM without putting in some work themselves. They aren't magical oracles. |
|
|