NetOpWibby 2 hours ago
How are they able to compare with Fable when Fable was only available for three days?
Topfi 2 hours ago
Terminalbench numbers are publicly available. What is more interesting is why that is the only benchmark they highlight. Maybe 5.6 isn’t that far ahead of Fable 5 in DeepSWE and FrontierCode (which I consider the most useful and the closest to my own evals and subjective experience)…