nextos 4 days ago
The advantage of small purpose-specific models is that they might be much more robust, i.e., less likely to generate wrong sequences for your particular domain. That is at least my experience working on this topic during 2025. And, obviously, smaller models can be deployed on cheaper hardware, with lower latency, lower energy consumption, etc. In some domains like robotics, these two advantages might be very compelling, but it's obviously too early to draw any long-term conclusions.
larodi 4 days ago
I second this. Smaller models may indeed be much better positioned for fine-tuning for the very reason you point out: less noise to begin with.