ACCount37 5 days ago

That's a really complex, very out-of-distribution, hard-to-know question for the early LLMs. Not that it's too hard to fix that, mind.

Those LLMs weren't very aware of tokenizer limitations - let alone aware enough to recognize them or work around them in the wild.
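To make the tokenizer point concrete, here is a minimal sketch of why letter-counting is hard for a model: it never sees characters, only token IDs. The vocabulary and the greedy longest-match scheme below are hypothetical stand-ins for a real BPE tokenizer, not any actual model's vocabulary.

```python
# Toy illustration: an LLM receives token IDs, not characters.
# This vocabulary is made up for the example; real vocabularies
# have tens of thousands of entries.
vocab = {"DE": 0, "EP": 1, "SEEK": 2, "D": 3, "E": 4}

def tokenize(text, vocab):
    """Greedy longest-match split, a crude stand-in for BPE."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens

tokens = tokenize("DEEPSEEK", vocab)
print(tokens)                     # ['DE', 'EP', 'SEEK']
print([vocab[t] for t in tokens])  # [0, 1, 2] -- all the model sees
```

Because the D is buried inside the token "DE", counting letters requires knowledge *about* the tokens rather than anything visible in the input itself.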

lapcat 5 days ago | parent [-]

> That's a really complex, very out-of-distribution, hard-to-know question

No, it's not. It's a trivial question in any context.

> for the early LLMs.

Early? Claude 3.7 was introduced just 6 months ago, and Deepseek-V3 9 months ago. How is that "early"?

ACCount37 5 days ago | parent [-]

Do I really have to explain what the fuck a "tokenizer" is, and why does this question hit the tokenizer limitations? And thus requires extra metacognitive skills for an LLM to be able to answer it correctly?

lapcat 5 days ago | parent | next [-]

> Do I really have to explain what the fuck

Please respect the HN guidelines: https://news.ycombinator.com/newsguidelines.html

What you need to explain is your claim that the cited LLMs are "early". According to the footnotes, the paper has been in the works since at least May 2025. Thus, those LLMs may have been the latest at the time, which was not that long ago.

In any case, given your guidelines violations, I won't be continuing in this thread.

Jensson 5 days ago | parent | prev [-]

The only "metacognitive" skill it needs is to know how many Ds are in each token, and sum those up. Humans are great at that sort of skill, which is why they can answer that sort of question even in languages where each character represents a group of sounds rather than a single one, like Japanese katakana. That is not hard at all.
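The summing step described above is trivial once per-token counts are available. A minimal sketch, using an assumed (hypothetical) tokenization of "DEEPSEEK":

```python
# If a model "knew" how many D's each token contains,
# the rest is just addition. The token split is assumed
# for illustration, not taken from a real tokenizer.
tokens = ["DE", "EP", "SEEK"]  # hypothetical split of "DEEPSEEK"
d_per_token = {t: t.count("D") for t in tokens}
total = sum(d_per_token[t] for t in tokens)
print(d_per_token)  # {'DE': 1, 'EP': 0, 'SEEK': 0}
print(total)        # 1
```

The hard part for an LLM is not the addition but acquiring reliable per-token letter counts in the first place, since those never appear in its input.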

LLMs are also really great at this skill when there is ample data for it. There is not a lot of data for "how many D in DEEPSEEK", so they fail at it.