nh43215rgb 6 days ago
270M is a nice (and rare) addition. Is there a reason this isn't categorized as a gemma3n model? I thought small models went under the gemma3n category.
rao-v 6 days ago | parent
Not at Google (anymore), but Gemma3n is a radically different (and very cool) architecture. The MatFormer approach essentially lets you efficiently change how many of the model's parameters you use at inference time. The 2B model they released is just the sub-model embedded in the original 4B model, and you can also fiddle with the model and pull out a 2.5B or 3B version. This 270M model is a more traditional LLM architecture (like the original Gemma 3 4B, but smaller), trained on an insane (for the size) number of tokens.
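To make the "sub-model embedded in the bigger model" idea concrete, here's a toy sketch (my own illustration, not Google's code; the class name and sizes are invented): the smaller model's FFN weights are a prefix slice of the larger model's, so one set of trained weights can be run at several widths.

    # Toy sketch of the MatFormer/Matryoshka idea (hypothetical code,
    # not Google's implementation). A smaller FFN is obtained by taking
    # a prefix slice of the full model's weight matrices.
    import torch
    import torch.nn as nn

    class MatFFN(nn.Module):
        def __init__(self, d_model=512, d_ff_full=4096):
            super().__init__()
            self.up = nn.Linear(d_model, d_ff_full)
            self.down = nn.Linear(d_ff_full, d_model)

        def forward(self, x, d_ff=None):
            # Use only the first d_ff hidden units; None means full width.
            d_ff = d_ff or self.up.out_features
            h = torch.relu(x @ self.up.weight[:d_ff].T + self.up.bias[:d_ff])
            return h @ self.down.weight[:, :d_ff].T + self.down.bias

    ffn = MatFFN()
    x = torch.randn(1, 512)
    y_full = ffn(x)              # full-width FFN ("4B"-like)
    y_small = ffn(x, d_ff=2048)  # prefix slice ("2B"-like), no retraining

The real trick is that MatFormer training optimizes the prefix slices jointly with the full model, so the sliced-down model actually performs well rather than being an arbitrary truncation.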