chris_st 4 hours ago

Well, to be fair, people cheat by remembering what they did last time. I think the idea here is to run the models from a "clean slate" and see how often they succeed/fail.

They are, like people, non-deterministic, so giving them several "fair" trials makes sense to me.
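
For what it's worth, here is a rough sketch of what I mean by several "fair" trials. Everything in it is made up for illustration: `success_rate` and `make_flaky_attempt` are hypothetical names, and the "model" is just a seeded coin flip standing in for a real call that starts with no memory of earlier attempts.

```python
import random
from typing import Callable


def success_rate(attempt: Callable[[], bool], n_trials: int) -> float:
    """Run n_trials independent attempts and return the fraction that succeed."""
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    successes = sum(1 for _ in range(n_trials) if attempt())
    return successes / n_trials


def make_flaky_attempt(p_success: float, seed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a model call: succeeds with probability p_success."""
    rng = random.Random(seed)

    def attempt() -> bool:
        # No state from earlier attempts is consulted, only the random draw,
        # which is the "clean slate" part of the idea.
        return rng.random() < p_success

    return attempt


if __name__ == "__main__":
    attempt = make_flaky_attempt(p_success=0.7, seed=0)
    print(f"observed success rate over 20 trials: {success_rate(attempt, 20):.2f}")
```

A single run tells you very little about a non-deterministic system; the fraction over many independent runs is the number worth comparing.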