wongarsu a day ago

Tbf, most of the "real benchmarks" have issues that are just as bad. Assessing LLM performance is just hard.

oceansky a day ago

And personal too. Different engineers are using them for different use cases.