raincole 2 hours ago

Every improvement in AI ability gets its own corresponding denial phrase. Some people still think AI can't generate the correct number of fingers today.

halJordan an hour ago | parent [-]

I love to hate it when someone unironically thinks asking an LLM how many letters are in a word is a good test.

Jerrrrrrrry 7 minutes ago | parent [-]

It is a good test now, for reasoning models.

It was a terrible test for pure tokenized models, because the logit that carries the carry digit during summation has a decent chance of getting lost.

SOTA models should reason to generate a function that returns the count of a given character, evaluate the function with tests, and use it for the output.
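
Roughly what I mean, as a minimal sketch. The name count_char is made up for illustration, and case-insensitive matching is my own assumption, not something any particular model is known to do:

```python
def count_char(word: str, char: str) -> int:
    """Return how many times `char` occurs in `word`.

    Hypothetical helper; matching is case-insensitive by assumption.
    """
    if len(char) != 1:
        raise ValueError("char must be a single character")
    return word.lower().count(char.lower())


# The "evaluate the function with tests" step: quick checks on known answers.
assert count_char("strawberry", "r") == 3
assert count_char("Mississippi", "s") == 4
assert count_char("banana", "z") == 0

# The "use it for the output" step.
print(count_char("strawberry", "r"))
```

The point is that the count comes from executing code over actual characters rather than from the model guessing over tokens.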