▲ | JKCalhoun 2 days ago |
He recently posted a question he put to grok3 — a variation on the trick LLM question (my characterization) of "count the number of this letter in this word." Apparently this Achilles heel is a well-known LLM shortcoming. Weirdly though, I tried the same example he gave on lmarena and actually got the correct result from grok3, not what Gary got. So I am a little suspicious of his ... methodology? Since LLMs are not deterministic, it's possible we are both right (or were testing different variations of the model?). But there's a righteousness to his glee in finding these faults in LLMs. He never hedges with "but your results may vary" or "but perhaps they will soon be able to accomplish this." EDIT: the exact prompt (his typo 'world'): "Can you circle all the consonants in the world Chattanooga"
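(For context, part of what makes this a fair probe is that the task is trivially deterministic in code — a few lines settle the "right" answer regardless of sampling temperature. A minimal sketch; the vowel set and the simplification of "circle" to "list" are my own assumptions:)

```python
# List the consonants in a word, treating a, e, i, o, u as vowels.
# "Circling" is simplified to listing; case is preserved in the output.
def consonants(word):
    vowels = set("aeiou")
    return [ch for ch in word if ch.isalpha() and ch.lower() not in vowels]

print(consonants("Chattanooga"))  # ['C', 'h', 't', 't', 'n', 'g']
```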
▲ | jonny_eh 2 days ago | parent | next [-] |
I think it's fair to say, though, that if your results may vary, and may be wrong, then they're not reliable enough for many use cases. I'd have to see his full argument to know whether that's what he was claiming. I'm just trying to be charitable here.
▲ | th0ma5 2 days ago | parent | prev [-] |
I don't see it as righteous glee, just a hope that people will see the problem with the fact that you could even begin to be suspicious of him. If it is so easy to get something wrong when you're trying to be correct, or to get something accidentally correct when you're trying to expose what's wrong ... then what are we really doing here with these things?