zaptheimpaler 4 hours ago
I tried gemma-4-26B-A4B just to see if it could help me read and sort my emails on a relatively underpowered setup (16GB VRAM + 32GB RAM), and it's not going well. The model burns 24K tokens just searching for the right tool and then dumps the email contents into context. I tried to get it to use code-mode to save context, but the code-mode implementation can't save files, so it was useless. I'm going to try switching to "ssh-mode" into my devbox container instead. I'm still relatively new to this, so I'm probably doing something wrong.
anana_ 4 hours ago
Perhaps try a different model? Anecdotally, I find that the Gemma models smaller than 31B do not call tools as often as they should. Some of the benchmarks appear to back this up [0]. Of course, a lot depends on how you are using it (inference parameters, harness, prompting, etc.), but the model is quite important too.

[0]: https://artificialanalysis.ai/models/open-source/small?model...