WhyOhWhyQ 4 days ago

Isn't this in contradiction to your blog post from yesterday though? It's impossible to prove a complex project made in 4.5 hours works. It might have passed 9000 tests, but surely there are always going to be edge cases. I personally wouldn't be comfortable claiming I've proved it works and saying the job is done, even if the LLM did the whole thing and all existing tests passed, until I had played with it for several months. And even then I would assume I would need to rely on bug reports coming in because it's running on lots of different systems. I honestly don't know if software is ever really finished.

My takeaway from your blog post yesterday was that with a robust enough testing system the LLM can do the entire thing while I do Christmas with the family.

(Before all the AI fans come in here: I'm not criticizing AI.)

simonw 4 days ago

That's why I don't consider my blog post from yesterday to be production-quality code. I'd need to invest a lot more work in reviewing it before I staked my reputation on it.

BeefySwain 4 days ago

Consider that this isn't just a random AI-slopped assortment of 9,000 tests, but instead is a robust suite of tests that cover 100% of the HTML5 spec.

Does this guarantee that it functions completely with no errors whatsoever? Certainly not. You need formal verification for that. I don't think that contradicts what Simon was advocating for though in this post.

WhyOhWhyQ 4 days ago

I think it would be interesting if professional engineering becomes more like producing formally correct documents for the AI to implement.

ncruces 4 days ago

We have these tools that we use to write formally correct documents.

They're called programming languages, and a deterministic algorithm translates them to machine code.

Are we sure English and a probabilistic algorithm are any better at this?

WhyOhWhyQ 4 days ago

I actually hate AI to my core, to the point that if it gets much more advanced I'll likely be in an existential crisis, so don't attack me on those grounds. Given that it exists, I'm going to find what's good about it though. I do think the problem of AI existing has to be confronted. Maybe one solution is that the human produces specs like the HTML5 one, and the AI implements them in software.