I haven’t studied the project that this is a comment on, but: The article notes that something that compiles, runs, and renders a trivial HTML page might be a good starting point, and I would certainly agree with that when it’s humans writing the code. But is it the only way? Instead of keeping “builds and runs” constant and varying what it does, can it make sense to keep “a decent-sized subset of browser functionality” constant and vary the “builds and runs” bit? (Admittedly, that bit does not seem to be converging here, but I’m curious in more general terms.)

johntb86 10 hours ago

In theory you could generate a bunch of code that seems mostly correct and then gradually tweak it until it's closer and closer to compiling/working, but that seems ill-suited to how current AI agents work (or even how people work). AI agents are prone to make very local fixes without an understanding of wider context, where those local fixes break a lot of assumptions in other pieces of code.

It can be very hard to determine if an isolated patch that goes from one broken state to a different broken state is on net an improvement. Even if you were to count compile errors and attempt to minimize them, some compile errors can demonstrate fatal flaws in the design while others are minor syntax issues. It's much easier to say that broken tests are very bad and should be avoided completely, as then it's easier to ensure that no patch makes things worse than it was before.
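A minimal sketch of that "no patch makes things worse" rule, assuming each test run can be summarised as the set of test names that passed. `SuiteResult`, `newly_broken`, `accept_patch` and the test names are hypothetical, made up for illustration; they are not taken from the project under discussion.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class SuiteResult:
    """Names of the tests that passed in one run of the suite (assumed shape)."""
    passed: FrozenSet[str]


def newly_broken(before: SuiteResult, after: SuiteResult) -> List[str]:
    """Tests that passed before the patch but no longer pass after it."""
    return sorted(before.passed - after.passed)


def accept_patch(before: SuiteResult, after: SuiteResult) -> bool:
    """Accept a patch only if it breaks nothing that used to pass."""
    return not newly_broken(before, after)


# Hypothetical test names, for illustration only.
baseline = SuiteResult(frozenset({"parse_html", "layout_block"}))
adds_feature = SuiteResult(frozenset({"parse_html", "layout_block", "paint_text"}))
trades_one_for_another = SuiteResult(frozenset({"parse_html", "paint_text"}))

print(accept_patch(baseline, adds_feature))            # True
print(accept_patch(baseline, trades_one_for_another))  # False
print(newly_broken(baseline, trades_one_for_another))  # ['layout_block']
```

The second patch passes the same number of tests as the baseline, yet it is rejected. Comparing sets rather than counts is what avoids having to judge whether one broken state is better than another.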

eloisius 9 hours ago

> generate a bunch of code that seems mostly correct and then gradually tweak it until it's closer and closer to compiling/working

The diffusion model of software engineering

madeofpalk 10 hours ago

...What use is code if it doesn't build and run? What other way is there to build a browser that doesn't involve 'build and run'?

Writing junk in a text file isn't the hard part.

Pinus 9 hours ago

Obviously, it has to eventually build and run if there’s to be any point to it, but is it necessary that every, or even any, step along the way builds and runs? I imagine some sort of iterative setup where one component generates code, more or less "intelligently", and others check it against the C, HTML, JavaScript, CSS and what-have-you specs, and the whole thing iterates until all the checking components are happy. The components can’t be completely separate, of course; they’d have to be more or less intermingled or convergence would be very slow (like when lcamtuf had his fuzzer generate a JPEG out of an empty file). But isn’t that basically what (large) neural networks are: tangled messes of interconnected functions that do things in ways too complicated for anyone to bother figuring out?
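A toy sketch of the generate-and-check loop described above, under heavy assumptions: the "code" is just a string, the "spec checkers" are two trivial functions that count violations, and the generator makes blind random edits. Every name here (`iterate_until_clean`, `random_edit`, the checkers) is hypothetical. It only illustrates the loop's shape and is not a claim about how such a system would really be built.

```python
import random
from typing import Callable, Sequence, Tuple

# Assumed checker shape: returns the number of violations found (0 means happy).
Checker = Callable[[str], int]


def total_violations(candidate: str, checkers: Sequence[Checker]) -> int:
    """Sum the violation counts reported by all checkers."""
    return sum(check(candidate) for check in checkers)


def iterate_until_clean(
    candidate: str,
    mutate: Callable[[str, random.Random], str],
    checkers: Sequence[Checker],
    max_steps: int,
    rng: random.Random,
) -> Tuple[str, int]:
    """Keep a proposed edit only if the checkers complain no more than before."""
    score = total_violations(candidate, checkers)
    for _ in range(max_steps):
        if score == 0:
            break
        proposal = mutate(candidate, rng)
        proposal_score = total_violations(proposal, checkers)
        if proposal_score <= score:
            candidate, score = proposal, proposal_score
    return candidate, score


# Toy stand-ins for real spec checkers.
def unbalanced_brackets(text: str) -> int:
    return abs(text.count("<") - text.count(">"))


def missing_paragraph(text: str) -> int:
    return 0 if "<p>" in text else 1


def random_edit(text: str, rng: random.Random) -> str:
    """Blind generator: insert one random token at a random position."""
    tokens = ["<p>", "</p>", "<", ">", "a", "b"]
    pos = rng.randrange(len(text) + 1)
    return text[:pos] + rng.choice(tokens) + text[pos:]


checkers = [unbalanced_brackets, missing_paragraph]
result, remaining = iterate_until_clean("", random_edit, checkers, 2000, random.Random(0))
print(repr(result), remaining)
```

Because the generator gets no hint about what the checkers want, progress depends on luck. That is the slow-convergence problem of fully separate components, in miniature.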

malfist 8 hours ago

How do you iteratively improve a broken codebase of more than 3 million lines that doesn't compile?

brabel 7 hours ago

I don't want to defend the AI slop, but it's common for me to go on for a few weeks without being able to compile everything when doing something really big. I can still compile individual modules and run their tests, but not the full application (which puts all modules together)... but it may take a lot of time until all modules can come together and actually run the app.

fwip 8 hours ago

Human brains are big, tangled messes of interconnected neurons that do things in ways too complicated to figure out.

That doesn't mean we can usefully build software that is a big, tangled mess.