Sounds reasonable, but gains will go up. There is a ceiling somewhere, but we don't know where it is.

Yup, and the ceiling could be at 11% or at 50%. But my bet is that it's closer to the low end of that range than the high end. Models are no longer revolutionary, they are evolutionary, and the per-version difference is narrowing with each release.

	naasking 4 hours ago
		> Models are no longer revolutionary, they are evolutionary, and the per-version difference is narrowing with each release.

		We've definitely picked some low-hanging fruit, but I think there's still a lot of room for improvements that could lead to step changes in capabilities. I think we're only scratching the surface of looped language models, thinking in latent space, and multimodality.

		And even if the per-model differences are narrowing, even single-digit improvements in performance metrics could yield outsized effects in applicability and productivity. Consider services that guarantee one 9 of reliability vs. five 9s. In absolute terms that change is a trivial difference, but the increased reliability allows use in way, way more domains.
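To put rough numbers on the nines comparison, here is a minimal Python sketch. The function names and the 20-step chain are illustrative assumptions, not anything from the thread, and the chain calculation assumes every step fails independently, which real systems may not satisfy.

```python
def failures_per_million(nines: int) -> float:
    """Expected failures per million requests at `nines` nines of reliability."""
    return 1_000_000 * 10 ** (-nines)


def chain_success(nines: int, steps: int) -> float:
    """Probability that `steps` steps in a row all succeed.

    Assumption: each step succeeds independently with the same probability.
    """
    per_step = 1 - 10 ** (-nines)
    return per_step ** steps


# Compare one 9 (90%) with five 9s (99.999%) over a hypothetical 20-step task.
for n in (1, 5):
    print(n, failures_per_million(n), round(chain_success(n, 20), 4))
```

Under these assumptions, one 9 fails roughly one request in ten and a 20-step task rarely finishes cleanly, while five 9s keeps the same task almost always intact. That gap is what opens up the additional domains.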