Author here.

To be upfront about what this is: I'm not a Rust developer or a PHP internals person. This is an experiment in whether the "point the AI at the original project's test suite" methodology (the way Bun was driven against real-world suites) holds up when the human can't review the code. The oracle is php-src's own .phpt corpus, ~22k tests I didn't write. Current honest score: 3,844 passing (17.4%), with a realistic ceiling around 40-45%, since the rest test C extensions (GD, curl, intl, etc.) that are out of scope.
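For readers unfamiliar with the format, here is a minimal sketch of how a .phpt file can be used as an oracle. It is illustrative only and is not this project's runner: `parse_phpt`, `passes`, and `SAMPLE` are hypothetical names, the sample test is made up, and only the plain `--EXPECT--` section is handled, under the assumption that the comparison is an exact match after trimming whitespace.

```python
import re

# Section headers in a .phpt file look like "--TEST--", "--FILE--", "--EXPECT--".
SECTION_RE = re.compile(r"^--([A-Z_]+)--\s*$")


def parse_phpt(text: str) -> dict[str, str]:
    """Split .phpt text into a {section name: body} mapping (hypothetical helper)."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


def passes(sections: dict[str, str], actual_output: str) -> bool:
    """Assumption: a test passes when trimmed output equals trimmed --EXPECT--."""
    return actual_output.strip() == sections["EXPECT"].strip()


# Made-up example test, not taken from php-src.
SAMPLE = """--TEST--
echo prints a string
--FILE--
<?php
echo "hello";
?>
--EXPECT--
hello
"""

parsed = parse_phpt(SAMPLE)
print(parsed["TEST"], "->", passes(parsed, "hello\n"))
```

A real runner would also need to execute the `--FILE--` body with the interpreter under test and handle the other expectation sections (for example `--EXPECTF--`), which this sketch leaves out.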

"Renders WordPress" means: a fresh install completes into SQLite, the front page renders with real posts and a real theme, and /wp-admin/ renders without issues. The REST API is untested, and it's currently ~55x slower than PHP on the front page (a bytecode VM is in progress; micro-benchmarks are already at 1-3x of PHP 8.5).

The scoreboard auto-generates into the repo after every run, whether the number went up or down.
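As a rough sketch of the arithmetic behind those numbers (not the scoreboard's actual code): the total below is an assumption, since the corpus size is only given as "~22k", and `pass_rate` is a hypothetical helper. It also works out what the 40-45% ceiling would imply for the pass rate among in-scope tests.

```python
def pass_rate(passing: int, total: int) -> float:
    """Percentage of tests passing, rounded to one decimal place."""
    return round(100 * passing / total, 1)


PASSING = 3_844   # figure quoted in this comment
TOTAL = 22_000    # assumption: the corpus size is only stated as "~22k"

overall = pass_rate(PASSING, TOTAL)

# If only 40-45% of the corpus is reachable, the in-scope test count is roughly:
in_scope_low = TOTAL * 40 // 100
in_scope_high = TOTAL * 45 // 100

# Pass rate measured against the in-scope tests only.
rate_vs_high_ceiling = pass_rate(PASSING, in_scope_high)
rate_vs_low_ceiling = pass_rate(PASSING, in_scope_low)

print(f"overall: {overall}%")
print(f"in-scope tests: {in_scope_low}-{in_scope_high}")
print(f"in-scope pass rate: {rate_vs_high_ceiling}%-{rate_vs_low_ceiling}%")
```

Because the total is rounded, the overall figure printed here can differ slightly from the 17.4% reported above.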

Happy to answer anything.

adamtaylor_13 3 hours ago

This is a pretty cool experiment. Thanks for sharing!

pluc 2 hours ago

Compare with FrankenPHP?

bbg2401 3 hours ago

Will you answer questions yourself, or will you simply pass on what your LLM of choice writes for you?

Edit: On further inspection, the blog design, the blog build, the blog articles and even the anecdotes used in the articles are entirely Claude-generated.

Stop being so lazy. Get Claude to do something interesting and use your own intellect to assess and challenge the work in your write-up. Or the other way around. Inject some amount of human work, at least. Otherwise, what's the point in sharing?

cataphract 40 minutes ago

The "honest score" is the most annoying claudism of the comment, with the short disjoint sentences a close second.

superdisk 29 minutes ago

It was "I need you to sit with:" that immediately made me close the article. I like LLM programming, but I really don't understand why so many people just post LLM-generated articles. What did the human even do at that point, press the start button?

ShinyLeftPad 3 hours ago

> will you simply pass on what your LLM of choice writes for you?

But it will be at least 17% correct!