xyzsparetimexyz 4 hours ago

What kind of basic ass CRUD apps are people even working on that they're on stage 5 and up? Certainly not anything with performance, visual, embedded or GPU requirements.

IanCal 3 hours ago | parent | next [-]

I think you massively underestimate the number of useful apps that are CRUD plus a bit of business logic and styling. They’re useful, can genuinely take time to build, can be unique every time, and yet are not brand new research projects.

krackers an hour ago | parent | next [-]

A lot of stuff is simultaneously useful but not mission critical, which is where I think the sweet spot of LLMs currently lies.

In terms of the state of software quality, the bar has actually been _lowered_, in that even major user-facing bugs in operating systems are no longer a showstopper. So it's no surprise to me that people are vibe-coding things "in prod" that they actually sell to other people. (Some even theorize Claude Code itself is vibe-coded, hence its bugs, and yet that hasn't slowed down adoption because of the Claude Max lock-in.)

So maybe one alternate way to see the "productivity gains" from vibe-coding in deployed software is that it's actually a realization that quality doesn't matter. The seeds for this were already sown years back when QA vanished as a field.

LLMs occupy a new realm on the Pareto frontier, the "slipshod expert". Usually humans grow from "sloppy incompetent newb" to "prudent experienced dev". But now we have a strange situation where LLMs can write code (e.g. vectorized loops, CUDA kernels) that could normally only be written by those with sufficient domain knowledge, and yet (ironically) it's not done with the attention and fastidiousness you'd expect from such an experienced dev.

xyzsparetimexyz 3 hours ago | parent | prev [-]

No totally, I agree. But I don't think that anyone will be YOLO vibe coding massive changes into Blender or ffmpeg any time soon.

IanCal 3 hours ago | parent [-]

Probably not, though additions maybe - I added the feature where the sculpt tool turns as you move it around if I recall right, many moons ago - I don’t think it was that hard but was a useful change.

tjr 4 hours ago | parent | prev [-]

What would be an example of something you think wouldn’t work with 5 or higher? Is there something about GPU programming that LLMs can’t handle?

xyzsparetimexyz 4 hours ago | parent [-]

I doubt they'd do a very good job of debugging a GPU crash, or visual noise caused by forgotten synchronization, or odd-looking shadows.

Maybe for some things you could set it up so that the screen output is livestreamed back into the agent, but I highly doubt that anyone is doing that for agents like this yet.

throwup238 2 hours ago | parent | next [-]

> Maybe for some things you could set it up so that the screen output is livestreamed back into the agent, but I highly doubt that anyone is doing that for agents like this yet

What do you mean by streaming? LLMs aren’t yet advanced enough to consume a live video feed, but people have been feeding them screenshots from Playwright and desktop apps for years (Anthropic even released the Computer Use feature based on this).

Gemini has the best visual intelligence, but all three of the major models have supported this for a while. I don’t think it’d help with fixing subtle problems in shadows, but it can fix other GUI bugs using visual feedback.

jjmarr 2 hours ago | parent | prev [-]

I am a GPU programmer (on the compute side), and the biggest challenge is lack of tooling.

For host-side code the agent can throw in a bunch of logging statements and usually printf its way to success. For device-side code there isn't a good way to output debugging info into a textual format understandable by the agent. Graphical trace viewers are great for humans, not so great for AI right now.

On the other hand, Cline's harness can interact with my website and click on stuff until the bugs are gone.

akiselev 2 hours ago | parent [-]

(Shameless plug) I've been using my debugger-cli [1] to enable agents to debug code using debuggers that support the Debug Adapter Protocol. It looks like cuda-gdb supports DAP, so I'd love to add support. I just need help from someone who can test it adequately (kernels/warps/etc. don't quite translate to a generic DAP client implementation).

[1] https://github.com/akiselev/debugger-cli
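
For anyone who hasn't looked at DAP on the wire: each message is a JSON body preceded by a `Content-Length` header and a blank line, which is part of why it is easy to drive from a text-only agent. Below is a minimal Python sketch of just that framing. It is an illustration under my own assumptions, not how debugger-cli or cuda-gdb implement it; the function names and the `clientID`/`adapterID` values are hypothetical.

```python
import json
from typing import List, Tuple


def encode_dap_message(payload: dict) -> bytes:
    """Frame a DAP message: Content-Length header, blank line, JSON body."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_dap_messages(buffer: bytes) -> Tuple[List[dict], bytes]:
    """Parse every complete message in buffer; return them plus leftover bytes."""
    messages = []
    while True:
        header_end = buffer.find(b"\r\n\r\n")
        if header_end == -1:
            break  # header not fully received yet
        length = None
        for line in buffer[:header_end].decode("ascii").split("\r\n"):
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("missing Content-Length header")
        body_start = header_end + 4
        body_end = body_start + length
        if len(buffer) < body_end:
            break  # body not fully received yet
        messages.append(json.loads(buffer[body_start:body_end].decode("utf-8")))
        buffer = buffer[body_end:]
    return messages, buffer


# Hypothetical "initialize" request; the ID values are placeholders.
request = {
    "seq": 1,
    "type": "request",
    "command": "initialize",
    "arguments": {"clientID": "agent-sketch", "adapterID": "example-adapter"},
}
wire = encode_dap_message(request)
```

In practice you'd write `wire` to the adapter's stdin or socket and feed whatever comes back through `decode_dap_messages`, carrying the leftover bytes into the next read. The GPU-specific parts (kernels, warps, focus switching) are exactly what this generic framing does not cover.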

rescbr 23 minutes ago | parent [-]

This is great. I hate LLMs fiddling around with logging calls to get some debugging capability.

Now they can be promoted from junior coders into mid-level coders :)