abeppu a year ago

While it's great that this is open source, and I understand the pressure for smaller models that can be run in a wider range of contexts, I continue to be annoyed that authors keep posting comparisons to models which are slightly smaller.

On this page, SmolLM2-1.7B does a bit better than Qwen2.5-1.5B, which is ahead of Llama3.2-1B. At the next size level up, in other comparisons I've seen, e.g. Phi-3.5 (which is ~3.8B params) does a bit better than Llama 3.2 3B. Gemma 2 has a 9B size, Llama 3.1 has an 8B size, and I think when that came out Mistral had a 7B model -- so whenever a new "small" thing does "better" than its peers, we can't easily tell whether that's because any of the many small choices the authors made were actually better, or just because the model is slightly bigger.