criemen 2 days ago
It's hard for me to say. I don't think you know where you are on the S-curve until after the fact. On the one hand, most models are "good enough" for ChatGPT-like usage, and there it's hard to see or feel generation-to-generation improvements. On the other hand, if you look at instruction following, handling long context windows, and staying on track through interactions with more than 200 tool calls, there's still plenty of room for improvement. So, hard to say where we are.