Jooror 6 hours ago

I’m curious about how you landed “git gud; prompt better” and not “maybe the domain I work in is a better fit for LLM code”. Or, to be a bit less generous, consider the possibility that the code you’re generating is boilerplate, marshaling, and/or API calls. A facade of perceived complexity over something that’s as complex as a filter-map or two.

3371 6 hours ago | parent | next [-]

Sharing my 2 cents.

In the past 2 months I've been using all the SOTA models to help me design a new DSL for narrative scripting (such as game storytelling) and a C# runtime implementation of the script player engine.

The language spec and design are about 95% authored by me up to this point; I have the LLMs work on the 2nd layer (the implementation specs/guidelines) and the 3rd layer (the concrete C# implementation).

Since it's a new language, I consider these somewhat new/novel tasks for LLMs (at least, not boilerplate stuff like an HTTP API or a CRUD service). I'd say these LLMs have been very helpful - you can tell they sometimes get confused and have trouble complying with the unfamiliar language spec and design - but they are mostly smart enough to carry out the objectives, and they got better and better once the project was on track and had plenty of files/resources to read and reference.
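To make the shape of such a project concrete: a narrative-scripting runtime at its core is a small interpreter over dialogue lines, labels, and jumps. The sketch below is hypothetical - the commenter's actual DSL and C# runtime are not public, so the syntax here ("Speaker: line", "[label]" anchors, "-> label" jumps) and every name are invented for illustration, and it's written in Python rather than C# purely for brevity.

```python
# Hypothetical sketch of a tiny narrative-DSL player. The syntax is
# invented for illustration; it is NOT the DSL described in the comment.

def play(script: str) -> list[str]:
    """Run a line-oriented script and return the dialogue lines emitted."""
    lines = [l.strip() for l in script.strip().splitlines() if l.strip()]
    # Pre-scan for "[label]" anchors so jumps can resolve to line indices.
    labels = {l[1:-1]: i for i, l in enumerate(lines)
              if l.startswith("[") and l.endswith("]")}
    out, i = [], 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("["):          # label anchor: no output
            i += 1
        elif line.startswith("-> "):      # unconditional jump to a label
            i = labels[line[3:]]
        else:                             # "Speaker: dialogue" line
            out.append(line)
            i += 1
    return out

demo = """
Guide: Welcome.
-> end
Guide: (this line is skipped)
[end]
Guide: Goodbye.
"""
print(play(demo))
```

Even a toy like this hints at why the task is not pure "decompression": the control-flow and scoping rules of a new DSL are decisions the designer makes, not patterns a model can simply recall.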

And I'd also say "prompt better" is an important factor, just much more nuanced/complicated than it sounds. I started with zero experience with LLM agents, have learned a lot about how to tame them, and developed a protocol for collaborating with agents. All of this came from countless trials and errors, but in the end it boils down to "prompt better".

Jooror 2 hours ago | parent | next [-]

I wonder if my intuition here is correct; I would posit that “PL implementation” is a far more popular and well-explored field than it seems. How many toy/small/labor-of-love langs make it to Show HN? How many more simply don’t?

I’ve never personally caught the language implementation bug. I appreciate your perspective here.

3371 2 hours ago | parent [-]

I totally agree, and I was fully aware of how commonly people make languages for fun when I replied.

But I feel like the rationale still stands: considering LLMs' nature, common boilerplate tasks are easy because they can kind of just "decompress" them from training data. But for a new language design, unless the language is almost identical to some other language captured by the model, that "decompression" would simply fail.

tovej 6 minutes ago | parent | prev [-]

I am prompting better. It doesn't help the LLM be more productive than me on a regular Tuesday.

Sure, I can get the task done by delegating everything to an agentic workflow, but it just adds a bunch of useless overhead to my work.

I still need to know what the code does at the end of the day, so I can document it and reason about it. If I write the code myself, it's easy. If an LLM does it, it's a chore.

And even without those concerns, the LLM is still slower than me. Unless it's trivial boilerplate, in which case other tools serve me better and cheaper.

I'll note that a compiler is one of the most well understood and implemented software projects, much of it open source, which means the LLM has a lot of prior art that it can copy.

rybosworld 5 hours ago | parent | prev | next [-]

When web search first arrived, the same thing happened. That is, some people didn't like using the tool because it wasn't finding what they wanted. This is still true for a lot of folks today, actually.

It's less "git gud; prompt better" and more "be able to explain (well) what you want as the output". If someone messages the IT guy and says "hey my computer is broken", what helpful advice can the IT guy offer beyond "turn it off and on again"?

mikkupikku 5 hours ago | parent | prev | next [-]

> I’m curious about how you landed “git gud; prompt better” and not “maybe the domain I work in is a better fit for LLM code”.

1. Personal experience. Lazy prompting vs careful prompting.

2. They're coincidentally good at things I'm good at, and shit at things I don't understand.

3. Following from 2, when used by somebody who does understand a problem space which I do not, they easily succeed. That dog vibe-coding games succeeded in getting Claude to write games because his owner knew a thing or two about it. I, on the other hand, have no game dev experience - almost no hobby experience with games, even - so I struggle to get any game code that even remotely works.

Jooror 2 hours ago | parent [-]

Irrespective of the domain you specifically listed in 3 (game dev is, believe it or not, one of the “more complex” domains), you have completely missed the point.

> 2. They're coincidentally good at things I'm good at, and shit at things I don't understand.

This may well be! In a perfect world this would be balanced against the knowledge that maybe “the things you’re good at” are objectively* easier than “the things you don’t understand”. Speaking for myself, I’m proficient in many more easy things than hard things.

*inasmuch as anything can be “objectively” easier

2 hours ago | parent [-]
[deleted]
vntok 5 hours ago | parent | prev [-]

The parent is specifically talking about producing boilerplate code - a domain in which LLMs excel - and not having had any success at it. It's therefore not a leap of logic to assume they haven't put (enough) effort into getting better at prompting first, which is perfectly fine per se, but it points towards a skill issue rather than an immutable property of gen AI.

The uncomfortable fact remains that one cannot really expect to get much better results from an LLM without putting in some work themselves. They aren't magical oracles.