Remix.run Logo
vladimir_gor a day ago

I'm a big fan of benchmarks and now finally we have one to evaluate models on physical tasks. Will be interesting to see how fast this gap will narrow.