visarga 3 days ago

Why train to pedal fast when we already got motorcycles? You are preparing for yesterday's needs. There will never be a time when we need to solve this manually like it's 2019. Even in 2019 we would probably have used Google; solving was already based on extensive web resources. In 1995, by contrast, you really would have needed to do it manually.

Instead of manual coding training your time is better invested in learning to channel coding agents, how to test code to our satisfaction, how to know if what AI did was any good. That is what we need to train to do. Testing without manual review, because manual review is just vibes, while tests are hard. If we treat AI-generated code like human code that requires a line-by-line peer review, we are just walking the motorcycle.

How do we automate our human-in-the-loop vibe reactions?

oblio 2 days ago

> Why train to pedal fast when we already got motorcycles? You are preparing for yesterday's needs.

This is funny in the sense that, in a properly built urban environment, bicycles are one of the best ways to add some physical activity to a time-constrained schedule, as we're discovering.

philipwhiuk 2 days ago

> Instead of manual coding training your time is better invested in learning to channel coding agents

All channelling is broken when the model is updated. Being knowledgeable about the foibles of a particular model release is a waste of time.

> how to test code to our satisfaction

Sure, testing has value.

> how to know if what AI did was any good

This is what code review is for.

> Testing without manual review, because manual review is just vibes

Calling manual review vibes is utterly ridiculous. It's not vibes to point out an O(n!) structure. It's not vibes to point out missing cases.

If your code reviews are 'vibes', you're bad at code review.

> If we treat AI-generated code like human code that requires a line-by-line peer review, we are just walking the motorcycle.

To fix the analogy: you're not reviewing the motorcycle, you're reviewing the motorcycle's behaviour during the lap.

visarga 2 days ago

> This is what code review is for.

My point is that visual inspection of code is just "vibe testing", and you can't reproduce it. Even you yourself, 6 months later, can't fully repeat the vibe check "LGTM" signal. That is why the proper form is a code test.
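A minimal sketch of that idea in Python: a review remark such as "what happens on empty input?" written down as a test, so it gives the same answer every time it runs, including 6 months later. The `mean` function and the test names are hypothetical, chosen only for illustration.

```python
def mean(values):
    """Hypothetical function under review: arithmetic mean of a list of numbers."""
    if not values:
        # The case a reviewer might have flagged by eye, now made explicit.
        raise ValueError("mean() of empty sequence")
    return sum(values) / len(values)


def test_mean_of_known_values():
    # A fixed input with a fixed expected answer: repeatable by anyone.
    assert mean([1, 2, 3]) == 2


def test_mean_rejects_empty_input():
    # The review comment, encoded so it cannot be forgotten or reinterpreted.
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

These are plain functions, so they can be called directly or collected by a test runner such as pytest.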

ben_w 2 days ago

Yes and no.

Yes, I reckon coding is dead.

No, that doesn't mean there's nothing to learn.

People like to make comparisons to calculators rendering mental arithmetic obsolete, so here's an anecdote. In my first year of university, I went to a local store and picked up three items, each costing less than £1, and the cashier rang up a total of more than £3. (I'd calculated the exact total and prepared the change before reaching the head of the queue, but the exact price of three items isn't important enough to remember 20+ years later.) The till itself was undoubtedly perfectly executing whatever maths it had been given; I assume the cashier mistyped or double-scanned. As I said, I had the exact total. The fact that I had to explain "three items costing less than £1 each cannot add up to more than £3" to the cashier shows that even this trivial level of mental arithmetic is not universal.

I now code with LLMs. They are so much faster than doing it by hand. But if I didn't already have experience of code review, I'd be limited to vibe-coding (by the original definition, not even checking). I've experimented with that to see what the result is, and the result is technical debt building up. I know what to do about that because of my experience with it in the past, and I can guide the LLM through that process, but if I didn't have that experience, the LLM would pile up more and more technical debt and grind the metaphorical motorbike's metaphorical wheels into the metaphorical mud.

visarga 2 days ago

> But if I didn't already have experience of code review, I'd be limited to vibe-coding (by the original definition, not even checking).

Code review done visually is "just vibe testing" in my book. It is not something you can reproduce; it depends on the context in your head at that moment. So we need actual code tests. Relying on "Looks Good To Me" is hand-waving, code-smell-level testing.

We are discussing vibe coding, but the problem is actually vibe testing. You don't even need to be in the AI age to vibe test; it's how we always did it when manually reviewing code. In this age it means moving at "walking your motorcycle" speed, so we need to automate it with more extensive code tests.

ben_w 2 days ago

I agree that actual tests are also necessary, that code review is not enough by itself. As LLMs can also write tests, I think getting as close as is sane to 100% code coverage is almost the first thing people should be doing with LLM assistance (and also, "as close as is sane": make sure that it really is a question of "I thought carefully and have good reason why there's no point testing this" rather than "I'm done writing test code, I'm sure it's fine to not test this", because LLMs are just that cheap).

However, code review can spot things like "this is O(n^2) when it could be O(n•log(n))", or "you're doing a server round trip for each item instead of parallelising them" etc.
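As a hedged illustration of the first kind of finding, here are two hypothetical Python functions (the names are made up for this sketch) that return the same answers, so a purely behavioural test suite would pass for either one. The difference in complexity is the sort of thing a reviewer reading the code would notice.

```python
def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """O(n log n): sort once, then compare neighbours.

    Assumes the items can be ordered against each other.
    """
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))
```

A test such as `assert has_duplicates([3, 1, 3])` cannot tell the two apart; only a benchmark or a human reading the nested loop would.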

You can also ask an LLM for a code review. They're fast and cheap, and whatever the LLM catches is something you get without having to waste a coworker's time. But LLMs have blind spots, and more importantly all LLMs (being trained on roughly the same stuff in roughly the same way) have roughly the same blind spots, whereas human blind spots are less correlated and expand coverage.

And code smells are still relevant for LLMs. You do want to make sure they're e.g. using a centralised UI style system and not copy-pasting style into each widget, because duplication wastes tokens and is harder to correctly update with LLMs for much the same reason it is with humans: stuff gets missed during the process when it's copypasta.
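A minimal sketch of what a centralised style system could look like, in Python for brevity. `Theme`, `button_style` and `banner_style` are hypothetical names, not from any particular UI framework; the point is only that each widget reads from one shared definition instead of carrying its own copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theme:
    """Single source of truth for style values (illustrative fields only)."""
    primary_colour: str = "#005fcc"
    padding_px: int = 8


THEME = Theme()


def button_style(theme=THEME):
    # Reads from the shared theme rather than hard-coding the colour.
    return {"background": theme.primary_colour, "padding": theme.padding_px}


def banner_style(theme=THEME):
    # Same theme, different widget: one edit to Theme updates both.
    return {"border-color": theme.primary_colour, "padding": theme.padding_px * 2}
```

Changing `primary_colour` in one place updates every widget, whereas with copy-pasted literals each copy has to be found and edited, by a human or by an LLM.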