johnfn 7 hours ago

The author says that he runs both the reference implementation and the new Rust implementation through 2 million (!) randomly generated battles and flags every battle where the results don't line up.

simonw 7 hours ago

This is the key to the whole thing in my opinion.

If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.
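
For anyone who hasn't seen this kind of equivalence check (often called differential testing), here is a minimal sketch of the idea in Python. Everything in it is hypothetical: `reference_battle` and `ported_battle` are toy stand-ins, not the author's actual code, and the port has a deliberate bug planted so the harness has something to flag. The assumption is that both implementations can be driven from the same seed.

```python
import random


def reference_battle(seed: int) -> int:
    """Hypothetical stand-in for the reference implementation."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(3))


def ported_battle(seed: int) -> int:
    """Hypothetical stand-in for the port, with a deliberate bug
    on seeds divisible by 1000 so there is something to catch."""
    result = reference_battle(seed)
    return result + 1 if seed % 1000 == 0 else result


def find_mismatches(reference, candidate, seeds):
    """Run both implementations on every seed and collect the
    seeds where their results disagree."""
    return [s for s in seeds if reference(s) != candidate(s)]


seeds = range(1, 10_001)
mismatches = find_mismatches(reference_battle, ported_battle, seeds)
pass_rate = 1 - len(mismatches) / len(seeds)
print(f"{len(mismatches)} mismatches, pass rate {pass_rate:.2%}")
```

The useful part is that each flagged seed is a reproducible failing case you can hand back to the agent, rather than a vague "it doesn't work".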

Herring 6 hours ago

Yeah and he claims a pass rate of 99.96%. At that point you might be running into bugs in the original implementation.
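
To put that number in perspective, here is a quick back-of-the-envelope calculation. It assumes the 99.96% pass rate was measured over the same 2 million battles mentioned upthread, which nobody here has actually confirmed.

```python
# Assumption: the pass rate applies to the full 2 million battle run.
total_battles = 2_000_000
pass_rate_bp = 9_996  # 99.96% expressed in basis points to avoid float error

failing_battles = total_battles * (10_000 - pass_rate_bp) // 10_000
print(failing_battles)
```

Under that assumption, that is roughly 800 battles where the two implementations disagree.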