mdp2021 6 days ago
As many have repeated here, it's (generally) not meant for direct use. It is meant to be a good base for fine-tuning, so you can get something very fast. (In theory, if you fine-tuned Gemma3:270M on "templating cold calls to leads", it would become better than Qwen, and faster.)
wanderingmind 6 days ago
Why should we start fine-tuning Gemma when it is so bad? Why not focus the fine-tuning efforts on Qwen instead, when it starts off with much, much better outputs?