| ▲ | scirob 10 days ago | |
Quickly deployed it to check some benchmarks relevant for German language. These are results for CohereLabs/include-base-44 german only : Gemma 4 12B %61.9
The quwen 3 14B vs Gemma 4 12B difference is within random variance they same in some repeat runs they actually got the exact same score. Next step up Gemma 4 31B gets 0.676 on this. Or let in some reasoning Qwen 3 14B (reasoning) 0.676.I'll run some cheat-proof benchmarks ones tomorrow see if qwen is still on top. | ||
| ▲ | kordlessagain 9 days ago | parent [-] | |
I just ran a short tool use test and it's doing pretty well. | ||