whazor 5 hours ago

I mean, they train their model on their own training data, so it should score well on their own benchmark.