Remix.run Logo
isaacremuant 4 days ago

Except people use the same examples like blueberry and strawberry, which were used months ago, as if they're current.

These models can also call Counter from python's collections library or whatever other algorithm. Or are we claiming it should be a pure LLM as if that's what we use in the real world.

I don't get it, and I'm not one to hype up LLMs since they're absolutely faulty, but the fixation over this example screams of lack of use.

ceejayoz 4 days ago | parent | next [-]

It’s such a great example precisely for that reason - despite efforts, it comes back every time.

insin 4 days ago | parent | prev [-]

It's the most direct way to break the "magic computer" spell in users of all levels of understanding and ability. You stand it up next to the marketing deliberately laden with keywords related to human cognition, intended to induce the reader to anthropomorphise the product, and it immediately makes it look as silly as it truly is.

I work on the internal LLM chat app for a F100, so I see users who need that "oh!" moment daily. When this did the rounds again recently, I disabled our code execution tool which would normally work around it and the latest version of Claude, with "Thinking" toggled on, immediately got it wrong. It's perpetually current.