amarant 3 days ago
solve simple maths problems, for example the kind found in the game 4=10 [1]. It doesn't necessarily have to solve them reliably (some of them are quite difficult), but LLMs are just comically bad at this kind of thing. Any novel-ish logic puzzle like this, one whose answers can't simply be found in the training data, is in my opinion a fairly good benchmark for "thinking". Until an LLM can compete with a ten-year-old child at this kind of task, I'd argue that it's not yet "thinking". A thinking computer ought to be at least that good at maths, after all.

[1] https://play.google.com/store/apps/details?id=app.fourequals...
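(For anyone unfamiliar with the game: assuming the usual rule, that you arrange four given digits with +, -, *, / and parentheses so the expression equals 10, a brute-force solver is short. This is a sketch, not the game's own logic; the function name `solve` and the exact output format are my own.)

```python
from itertools import permutations, product
from fractions import Fraction

# The four binary operators the game allows (assumption: no concatenation).
OPS = {
    '+': lambda a, b: a + b,
    '-': lambda a, b: a - b,
    '*': lambda a, b: a * b,
    '/': lambda a, b: a / b,  # ZeroDivisionError is caught below
}

def solve(digits, target=10):
    """Return all expressions over the four digits that equal `target`,
    trying every ordering, operator choice, and parenthesization.
    Uses Fraction so division is exact (no float rounding surprises)."""
    target = Fraction(target)
    solutions = set()
    for a, b, c, d in permutations(digits):
        fa, fb, fc, fd = (Fraction(v) for v in (a, b, c, d))
        for x, y, z in product(OPS, repeat=3):
            # The five distinct parenthesizations of four operands.
            candidates = [
                (f'(({a}{x}{b}){y}{c}){z}{d}',
                 lambda: OPS[z](OPS[y](OPS[x](fa, fb), fc), fd)),
                (f'({a}{x}({b}{y}{c})){z}{d}',
                 lambda: OPS[z](OPS[x](fa, OPS[y](fb, fc)), fd)),
                (f'({a}{x}{b}){y}({c}{z}{d})',
                 lambda: OPS[y](OPS[x](fa, fb), OPS[z](fc, fd))),
                (f'{a}{x}(({b}{y}{c}){z}{d})',
                 lambda: OPS[x](fa, OPS[z](OPS[y](fb, fc), fd))),
                (f'{a}{x}({b}{y}({c}{z}{d}))',
                 lambda: OPS[x](fa, OPS[y](fb, OPS[z](fc, fd)))),
            ]
            for expr, evaluate in candidates:
                try:
                    if evaluate() == target:
                        solutions.add(expr)
                except ZeroDivisionError:
                    pass
    return sorted(solutions)
```

For example, `solve([1, 2, 3, 4])` finds `((1+2)+3)+4` among its answers, while `solve([1, 1, 1, 1])` correctly returns no solutions. The search space is tiny (4! orderings x 4^3 operators x 5 tree shapes = 7,680 expressions), which is exactly why "a computer can brute-force it" is beside the point: the benchmark is whether a model can reason its way there.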
simonw 3 days ago | parent
> solve simple maths problems, for example the kind found in the game 4=10

I'm pretty sure that's been solved for almost 12 months now: the current generation of "reasoning" models are really good at those kinds of problems.