| ▲ | YetAnotherNick 3 hours ago | |
The bigger issue is that they count small based on fixed number of parameters, and not the active parameter for MoE, didn't account for any hardware improvements etc. If they counted small based on the price or computational cost, I think they would have seen increase in small models. | ||