naasking 5 hours ago

Sounds reasonable, but gains will go up. There is a ceiling somewhere, but we don't know where it is.

Insanity 5 hours ago | parent [-]

Yup, and the ceiling could be at 11% or at 50%. But my bet is that it's closer to the low end of that range than the high end. Models are no longer revolutionary, they are evolutionary, and the evolution and per model-version difference is narrowing each release.

naasking 4 hours ago | parent [-]

> Models are no longer revolutionary, they are evolutionary, and the evolution and per model-version difference is narrowing each release.

We've definitely picked some low-hanging fruit, but I think there's still a lot of room for improvements that could lead to step changes in capabilities. I think we're only scratching the surface of looped language models, thinking in latent space, and multimodality.

And even if the per-model differences are narrowing, even single-digit improvements in performance metrics could yield outsized effects in applicability and productivity. Consider services that guarantee one 9 of reliability vs. five 9s. In absolute terms that change is a trivial difference, but the increased reliability allows use in way, way more domains.
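
To put rough numbers on the nines comparison, here is a quick sketch. It assumes "N nines" means a failure rate of 10^-N (one 9 = 90%, five 9s = 99.999%) and a 365-day year; the function names are just for illustration.

```python
def failure_rate(nines: int) -> float:
    """Fraction of time (or requests) that fails at the given number of nines.

    Assumption: N nines means availability of 1 - 10**-N.
    """
    return 10.0 ** -nines


def downtime_hours_per_year(nines: int, hours_per_year: float = 365 * 24) -> float:
    """Expected downtime in hours per year, assuming a 365-day year."""
    return failure_rate(nines) * hours_per_year


for n in (1, 5):
    availability_pct = (1 - failure_rate(n)) * 100
    print(f"{n} nine(s): {availability_pct:.3f}% available, "
          f"{downtime_hours_per_year(n):.4f} hours down per year")

# How many times smaller the failure rate gets going from one 9 to five 9s.
reduction = failure_rate(1) / failure_rate(5)
print(f"failure rate shrinks by a factor of about {reduction:,.0f}")
```

Under those assumptions, availability only moves by about 10 percentage points, but downtime drops from roughly 876 hours a year to roughly 5 minutes, i.e. the failure rate shrinks by a factor of about 10,000. That gap is what makes the service usable in domains where the one-9 version isn't.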