johnfn 7 hours ago
The author says that he runs both the reference implementation and the new Rust implementation through 2 million (!) randomly generated battles and flags every battle where the results don't line up.
simonw 7 hours ago
This is the key to the whole thing, in my opinion. If you ask a coding agent to port code from one language to another and don't have a robust mechanism to test that the results are equivalent, you're inevitably going to waste a lot of time and money on junk code that doesn't work.
Herring 6 hours ago
Yeah, and he claims a pass rate of 99.96%. At that point you might be running into bugs in the original implementation.
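For anyone unfamiliar with the technique discussed in this thread, it is usually called differential testing: feed the same randomly generated input to both implementations and flag every case where the outputs differ. Below is a minimal Python sketch, not the author's actual harness. `reference_battle`, `ported_battle`, and `differential_test` are hypothetical names, the "battle" is a toy, and the port contains a deliberate off-by-one bug so that some cases get flagged.

```python
import random


def reference_battle(seed: int) -> tuple[int, int]:
    """Toy stand-in for the reference implementation (hypothetical).

    Two fighters start at 100 HP and alternate attacks.
    Returns (winner_index, turns_taken).
    """
    rng = random.Random(seed)
    hp = [100, 100]
    attacker, turns = 0, 0
    while True:
        defender = 1 - attacker
        hp[defender] -= rng.randint(5, 20)
        turns += 1
        if hp[defender] <= 0:  # knocked out at exactly 0 HP or below
            return attacker, turns
        attacker = defender


def ported_battle(seed: int) -> tuple[int, int]:
    """Toy stand-in for the port, with a deliberate bug (hypothetical)."""
    rng = random.Random(seed)
    hp = [100, 100]
    attacker, turns = 0, 0
    while True:
        defender = 1 - attacker
        hp[defender] -= rng.randint(5, 20)
        turns += 1
        if hp[defender] < 0:  # BUG: a fighter at exactly 0 HP keeps fighting
            return attacker, turns
        attacker = defender


def differential_test(reference, candidate, n_cases: int, master_seed: int = 0) -> list[int]:
    """Run both implementations on the same random seeds.

    Returns the seeds whose results differ, so each failure can be replayed.
    """
    rng = random.Random(master_seed)
    mismatches = []
    for _ in range(n_cases):
        seed = rng.getrandbits(32)
        if reference(seed) != candidate(seed):
            mismatches.append(seed)
    return mismatches


if __name__ == "__main__":
    n = 20_000
    bad = differential_test(reference_battle, ported_battle, n)
    print(f"{len(bad)} mismatches out of {n}, pass rate {1 - len(bad) / n:.2%}")
```

Keeping the failing seeds matters more than the pass rate itself, because each one is a reproducible test case. For scale, if the 99.96% figure applies to the same 2 million battles, that leaves roughly 800 mismatches to look at, and each could be a bug in either implementation.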