criemen 2 days ago

It's hard for me to say. I don't think you know you're on the S-curve until after the fact.

On the one hand, most models are "good enough" for ChatGPT-like usage, and there it's hard to see or feel generation-to-generation improvements. On the other hand, if you look at instruction following, handling long context windows, or staying on track across >200 tool-call interactions, there's still plenty of room for improvement. So, hard to say where we are.