| ▲ | OsrsNeedsf2P 5 hours ago | |
I kinda feel like the goalposts are shifting. While we're not there yet, in a world where Chinese models surpass Western ones, HN will be nitpicking edge cases long after the ship has sailed. | ||
| ▲ | Oras 5 hours ago | parent [-] | |
I don’t think it’s undermining the effort and improvement, but the usability of these models usually isn’t what their benchmarks suggest. Last time there was hype about a GLM coding model, I tested it on some coding tasks and it wasn’t usable compared with Sonnet or GPT-5. I hope this one is different. | ||