dustrider 4 days ago
“Move beyond benchmarks”… and then it proceeds to list a bunch of benchmarks. The problem for me is that it’s not worth running these myself. Sure, I may pay attention to which model is better at tool calling, but what matters is how well it does at my use case.