paulddraper an hour ago

Last year, I saw LLMs do well in the first week, then accuracy dropped off after that.

But as others have said, it’s a night-and-day difference now, particularly with code execution.