bloppe 9 hours ago

Humans can also count the number of Rs in strawberry, but good luck proving that mathematically

EagnaIonat 9 hours ago | parent | next [-]

LLMs don't see words. They see tokens, which is why previously they had a hard time counting the r's.

You can certainly prove that mathematically, and if you give that proof to an LLM, it will give you the correct answer. That is a prompting technique, btw, for improving the accuracy of an LLM's results.

That said, the more recent LLMs have built-in approaches for answering this question.

Gemma4 for example:

-----

The user is asking for the number of 'r's in the word "strawberry".

* S

* T

* *R* (1)

* A

* W

* B

* E

* *R* (2)

* *R* (3)

* Y

There are 3 'r's.

-----

GPT-OSS-120b example.

-----

The user asks: "How many r's in strawberry." This is a straightforward counting problem: count the letter 'r' in the word "strawberry". The word "strawberry" contains letters: s t r a w b e r r y. Count r's: there are three 'r's? Let's check: s(1) t(2) r(3) a(4) w(5) b(6) e(7) r(8) r(9) y(10). Actually the word is "strawberry". Let's write out: s, t, r, a, w, b, e, r, r, y. So there are three r's: at positions 3, 8, 9. Yes, three r's. So answer: 3.

But we may also note that "r" appears three times.

Thus answer: 3.

We can provide a short answer.

Thus final: There are three r's in "strawberry".

-----

kang 7 hours ago | parent [-]

Doubt if you can make a dumb model smart by feeding it proofs

EagnaIonat 5 hours ago | parent [-]

https://www.promptingguide.ai/techniques/knowledge

Tade0 4 hours ago | parent [-]

Sounds like a great way to fill up the context before you even start.

falcor84 3 hours ago | parent [-]

Yes, what's your point? That is literally what it does: it adds relevant knowledge to the prompt before generating a response, in order to ground it more effectively.
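
A minimal sketch of what "adding relevant knowledge to the prompt" could look like for the strawberry case. The names `spell_out` and `build_prompt` and the prompt layout are hypothetical, made up for illustration, and the knowledge line is written by hand here rather than generated by a model, which is a simplification:

```python
def spell_out(word: str) -> str:
    """Produce a knowledge line that exposes the individual letters of a word."""
    return f'"{word}" is spelled: ' + ", ".join(word)


def build_prompt(question: str, knowledge: list[str]) -> str:
    """Prepend knowledge statements to the question (hypothetical prompt layout)."""
    lines = ["Knowledge:"]
    lines += [f"- {fact}" for fact in knowledge]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)


prompt = build_prompt(
    "How many r's are in strawberry?",
    [spell_out("strawberry")],
)
print(prompt)
```

Every knowledge line spends context tokens before the model starts answering, so the cost grows with the amount of knowledge prepended.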

Tade0 19 minutes ago | parent [-]

My point is that this doesn't scale. You want the LLM to have knowledge embedded in its weights, not prompted in.

tacotime 9 hours ago | parent | prev [-]

I doubt it is possible to mathematically prove much inside a black box of billions of interconnected weights. But at least in the narrow case of the strawberry problem, it seems likely that LLM inference could reliably recognize that sort of problem as the type that would benefit from a letter-counting tool call as part of the response.
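
For what it's worth, the tool side of that is tiny. A rough sketch, assuming a hypothetical tool-call format of `{"name": ..., "arguments": {...}}`; `count_letter`, `TOOLS`, and `handle_tool_call` are made-up names, not any particular vendor's API:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


# Hypothetical registry mapping tool names to plain Python functions.
TOOLS = {"count_letter": count_letter}


def handle_tool_call(call: dict) -> int:
    """Dispatch a tool call of the assumed form {"name": ..., "arguments": {...}}."""
    return TOOLS[call["name"]](**call["arguments"])


result = handle_tool_call(
    {"name": "count_letter", "arguments": {"word": "strawberry", "letter": "r"}}
)
print(result)  # 3
```

The counting itself is deterministic; the hard part is the model reliably deciding to emit the call.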