simonw 2 hours ago

Because it's an extremely large and complex project that is also very clearly specified, to the point that the three-word prompt "build a browser" encapsulates a huge amount of detail.

It's similar to "build space invaders", another useful test prompt for seeing how well an LLM can do at a medium-complexity task without having to give it a great deal of instruction.

I called building a browser the "hello world" of complex parallel agent coding harnesses the other day: https://simonwillison.net/2026/Jan/23/fastrender/#a-single-e...

fruitworks 2 hours ago | parent [-]

I am not convinced either of these is a good test prompt for generic complexity tasks. Many solutions have already been included in the training data!

You can trivially produce a web browser by copying and compiling the code for firefox, no transformer needed.

baxtr an hour ago | parent | next [-]

Can still be a good capability test. Building a car is a real world equivalent. It’s highly complex and has been done billions of times. Still hard to pull off if you ask me.

Choco31415 2 hours ago | parent | prev [-]

But that would produce Firefox.

The goal with these tests is to see if the models can make something new, not just copy an existing solution.

That is the goal, at least.

aix1 2 hours ago | parent | next [-]

But how do you define, or indeed assess, novelty?

It's not that difficult to take an existing mature codebase and morph it such that it looks quite different but is functionally unchanged.

This is a very different task than building something that's not been built before.

chihuahua 23 minutes ago | parent | prev [-]

Obviously Microsoft felt bad when they had to kill the old Edge browser that was based on their own HTML rendering engine. Must feel like a second-rate tech company when you can't write some code to render HTML with sufficient quality.

Now they can get back in the game with a 3-word prompt!

And then every time there's some change to web standards, it's just one more prompt where you say "Hey Copilot, take a look at this page that describes the change, and update our browser code to add this!"

/s