K0balt 3 hours ago

Because I have 100 percent test coverage (of the software; some hardware edge cases pop up that aren’t documented in the data sheets), and over 10k hours of field deployment across 130 devices? This rollout has been much more bug free than any we have done in the last six years, and it’s the first that has been almost zero hand coded. (Our system is far from vibe coding, however; there is a very strict pipeline.)

I’m not saying that AI can solve every problem or that it is without problems (we spent hundreds of hours developing a concept to production pipeline just to make sure it doesn’t go off the rails)

But the net result is that a good senior dev with an acutely olfactory paranoia can supervise a production pipeline and produce efficient, maintainable code at a much faster rate (and ridiculously lower cost) than he was doing before supervising 3 or 4 devs on a complex hardware project. I can’t speak for other types of development, but our applications devs are also leveraging AI code generation and it -seems- to be working out.

Now, where those senior devs are going to come from in the future… that imho is a huge problem. It’s definitely some flavor of eating the goose that lays the golden egg here.

ACCount37 3 hours ago | parent [-]

It's blindingly obvious what the big bet is. The senior devs are going to come from the next generations of AI systems.

K0balt an hour ago | parent [-]

That’s the big bet, for sure… but if it’s reasoning that the supervising devs are injecting, and AI systems can’t reason, I guess it won’t work? Idk, I kinda think they do reason, though not in the way people might think.

It’s definitely true that they are statistical next token predictors, and that is intrinsically pattern matching, which it’s reasonable to say is not capable of reasoning.

But my intuition is that that is not really what is going on. The token prediction is the hardware layer. The software is the sum total of collective human culture they are trained on. The software is doing the reasoning, not the hardware. Like a Z80 can’t play chess, but software that runs on a Z80 certainly can.

Idk, that’s my -feeling- on the conundrum. Who knows, I guess we will find out.

ACCount37 16 minutes ago | parent [-]

If the easiest pathway to high performance next token prediction lies through reasoning, then training for better next token prediction ends up training for reasoning implicitly.

By now, there's every reason to believe that this is what's happening in LLMs.

"Reasoning primitives" are learned in pre-training - and SFT and RL then assemble them into high performance reasoning chains, converting "reasoning as a side effect of next token prediction" to "reasoning as an explicit first class objective".

The end result is quite impressive. By now, it seems like the gap between human reasoning and LLM reasoning isn't "an entirely different thing altogether" - it's "humans still do it better at the very top end of the performance curve - when trained for the task and paying full attention".