| ▲ | bratao 7 hours ago | |||||||||||||
It is super strange that in all of the last (3?) releases they keep comparing against older models such as Opus-4.6. | ||||||||||||||
| ▲ | vessenes 7 hours ago | parent | next [-] | |||||||||||||
Some of it’s probably timing. Some of it is wanting to look good. That said, I just went to the claw-eval site, and neither 4.7 nor 5.5 from oAI is listed on the benchmarks. So there’s also just the time it takes others to get benchmarking done and published. | ||||||||||||||
| ▲ | varispeed 6 hours ago | parent | prev | next [-] | |||||||||||||
Opus-4.6 was probably the best model so far before it got nerfed. 4.7 is nowhere near the experience I had with 4.6. In fact, I stopped using it completely because, more often than not, its output is just dumber than that of local models. | ||||||||||||||
| ||||||||||||||
| ▲ | dyauspitr 5 hours ago | parent | prev [-] | |||||||||||||
Because these can’t compete with the SoTA but they’re close. | ||||||||||||||