…but they reason well enough given enough context (using their matmuls).

To this day, frontier models think that "A and not B" means "A and B" when the sentence gets pushed far enough back in their context window. The context length a model can reason over without obvious errors is much smaller than the advertised context: somewhere between 1/4 and 1/20 of what is advertised on the tin.

▲

antonvs 2 hours ago | parent | next [-]

Critiques like this tend to focus very hard on what models can't do. It's true, they have limitations.

But they're also superhuman in so many other ways. It's valid to point out limitations, but that doesn't support the conclusion that models are not incredibly powerful and capable of the functional equivalent of reasoning at human or superhuman levels in many scenarios.

▲

Npovview 4 hours ago | parent | prev [-]

Do you also happen to remember what you ate last Thursday?

▲

leecommamichael 4 hours ago | parent | next [-]

Is that the same gap as what you’re responding to? To me, it seems his critique is about advertised capability and logical statements, and your rhetorical(?) question is about memory.

▲

UncleEntity 2 hours ago | parent | prev [-]

"If you have a question look in the specification for the answer and don't just guess" seems a fairly important thing to remember for more than a couple of minutes...

▲

Npovview an hour ago | parent [-]

I had a coding session where I was working across two repositories, and CC forgot exactly which repository a particular file was in, so it kept grepping the parent directory. I asked it to write all the key-value pairs it considered important to a file, and after that it never grepped the parent directory again.
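
For anyone who wants to try the same trick, here is a rough sketch of the notes-file idea in Python. It is not what CC does internally. The helper names `remember` and `recall`, the file name `agent_notes.json`, and the example keys are all made up for illustration; the only assumption is that the agent can be told to read and update a small JSON file of key-value facts.

```python
import json
from pathlib import Path


def remember(notes_path, key, value):
    """Add or update one key-value fact in a JSON notes file (hypothetical helper)."""
    path = Path(notes_path)
    # Load existing notes if the file is already there, else start empty.
    notes = json.loads(path.read_text()) if path.exists() else {}
    notes[key] = value
    # Rewrite the whole file; fine for a handful of facts.
    path.write_text(json.dumps(notes, indent=2, sort_keys=True))
    return notes


def recall(notes_path, key, default=None):
    """Look up one fact; return `default` if the file or key is missing."""
    path = Path(notes_path)
    if not path.exists():
        return default
    return json.loads(path.read_text()).get(key, default)


if __name__ == "__main__":
    # Example keys and values are invented for illustration.
    remember("agent_notes.json", "config_file_repo", "repo-b")
    print(recall("agent_notes.json", "config_file_repo"))
```

The point is just that the fact lives in a file the model can re-read, rather than far back in the context window.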