| ▲ | minimaxir 2 days ago |
| The benchmark comparisons to Gemma 3 27B on Hugging Face are interesting: the Gemma 4 E4B variant (https://huggingface.co/google/gemma-4-E4B-it) beats the old 27B on every benchmark at a fraction of the parameter count. The E2B/E4B models also support voice input, which is rare. |
|
| ▲ | regularfry 2 days ago | parent [-] |
| Thinking vs. non-thinking. There'll be a token cost there. But still fairly remarkable! |
| ▲ | DoctorOetker 2 days ago | parent [-] |
| Is there a reason we can't use thinking completions to train non-thinking models? i.e., do gradient descent toward what the thinking model would have answered? |
| ▲ | joshred 2 days ago | parent [-] |
| From what I've read, that's already part of their training: they are scored on each step of their reasoning, not just on their final solution. I don't know if it's still the case, but for the early reasoning models, the "reasoning" output was more of a GUI feature to entertain the user than a faithful explanation of the steps being followed. |
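What DoctorOetker describes is essentially distillation: fine-tune a non-thinking model on the thinking model's outputs with the reasoning trace stripped, so it learns to emit the answer directly. A minimal sketch of the data-preparation step, assuming a `<think>...</think>` delimiter for the trace (an assumption; reasoning models use various formats):

```python
import re

def make_distillation_pair(prompt: str, thinking_completion: str) -> dict:
    """Strip the reasoning trace from a thinking model's completion,
    keeping only the final answer, so a non-thinking model can be
    fine-tuned on plain (prompt, answer) pairs.

    The <think>...</think> delimiter is an assumption here; the real
    trace format depends on the model family.
    """
    answer = re.sub(r"<think>.*?</think>", "", thinking_completion,
                    flags=re.DOTALL).strip()
    return {"prompt": prompt, "completion": answer}

pair = make_distillation_pair(
    "What is 12 * 7?",
    "<think>12 * 7 = 70 + 14 = 84.</think>The answer is 84.",
)
print(pair)
# {'prompt': 'What is 12 * 7?', 'completion': 'The answer is 84.'}
```

Pairs like this can then be fed to a standard supervised fine-tuning loop; the step-level scoring joshred mentions (process rewards) is a separate, reward-model-based training signal.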
|
|