andriy_koval 7 hours ago

I think it will make results way better and more representative of model abilities.

simonw 7 hours ago

It would... but the test is inherently silly, so I'm still not sure if it's worth me investing that extra effort in it.