Retric 3 hours ago

I suspected you felt that way even though it hasn’t been my personal experience.

I’ve heard people say older models can’t do X, when I had used them that way, etc. I suspect people are folding their own learning curve into their assessment of progress: you get better at writing prompts, and it feels like the model improved.

Which is why I’m saying we need some objective metrics to judge predictions of actual capacity.