| ▲ | ACCount37 7 hours ago | |||||||||||||||||||||||||
"Capability per parameter" is rising, but parameter count remains an advantage. And small models remain bad, because "good" is a rapidly moving target. A 2026 4B beats 2024 4B, but both are far behind the contemporary frontier. Which makes them bad. There is no such thing as "too much capability" - a "good" model is whatever the current frontier is. In 2024, a "good" model is one that can be trusted to write a 800 line script. In 2026, it's a model that can be trusted to do gnarly high-level planning and execution both. In 2028, it's going to be something like a model you can point at an extremely involved task, abandon, and have it report back with a "done" in 3 weeks. | ||||||||||||||||||||||||||
| ▲ | overfeed 5 hours ago | parent [-] | |||||||||||||||||||||||||
> A 2026 4B beats 2024 4B, but both are far behind the contemporary frontier. The thing about engineering is you don't just use the biggest bolt on the market on every bridge. > In 2024, a "good" model is one that can be trusted to write a 800 line script. In 2026, it's a model that can be trusted to do gnarly high-level planning and execution both This sounds a lot like having a single diamond-head hammer as the only tool in your toolbox. As suggested by the name, flash models are fast - sometimes I want to write the equivalent of fifty 800-line scripts. There is such a thing as good enough. | ||||||||||||||||||||||||||
| ||||||||||||||||||||||||||