sleepyeldrazi 3 hours ago
Finetuning takes few resources; base model training is the slow and expensive part. Architecturally, the 3.5 models are identical to their 3.6 counterparts, which is why there is a consensus that those are probably finetunes and not retrained from scratch, much like the finetunes you will see many people publish on Hugging Face.
genxy 3 hours ago
Understood, but look at their overall release cadence over the years and the breadth of their models. They are clearly not all finetunes. Meta, for all its billions, doesn't have anything comparable.