ianbutler 3 days ago

https://www.anthropic.com/research/tracing-thoughts-language...

This article counters a significant portion of what you put forward.

If the article is to be believed, these models are aware of an end goal, of their own intermediate thinking, and more.

The model even "thinks ahead", and the authors demonstrate this in at least one test.
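
For the curious: the paper's evidence comes from reading the model's internal activations, not from its text output. A rough at-home analogue is the "logit lens" trick (an established interpretability technique, not Anthropic's attribution-graph method): project a mid-layer hidden state through the unembedding matrix and see which tokens are already prominent before they are emitted. A minimal sketch, assuming GPT-2 as a stand-in; the prompt and layer choice are purely illustrative:

    # Logit-lens sketch: read candidate "future" tokens out of a mid-layer
    # hidden state. GPT-2 is a stand-in; this is not the Claude setup.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("He saw a carrot and had to grab it,", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)

    h = out.hidden_states[8][0, -1]   # mid-layer residual stream, last position
    h = model.transformer.ln_f(h)     # apply the final layer norm, as the real head does
    logits = model.lm_head(h)         # reuse the unembedding matrix
    print(tokenizer.convert_ids_to_tokens(logits.topk(5).indices.tolist()))

GPT-2 won't reproduce Claude's rhyme planning, but this kind of readout of internal state, rather than of generated text, is how "thinking ahead" gets probed.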

Robin_Message 3 days ago | parent

The weights are aware of the end goal, etc. But the model does not have meaningful access to those weights while it is producing a chain of thought.

So the model thinks ahead but cannot reason about its own thinking in a real way. It is rationalizing, not rational.
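
Concretely: in a standard decoder-only transformer, the only thing each generation step conditions on is the token sequence so far; the activations where any "plan" lives are computed internally and never surfaced as something the model can read and report on. A minimal sketch, again assuming GPT-2 as a stand-in (real serving adds a KV cache, but the point about introspection is the same):

    # The hidden states exist during the forward pass, but the only thing
    # carried into the next step of this loop is the sampled token id.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    ids = tokenizer("The rhyme I am building toward is", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)

    # Internal state is all here (13 layers of activations for GPT-2)...
    print(len(out.hidden_states), out.hidden_states[-1].shape)

    # ...but only the token id is appended to the input; the activations
    # are not part of what the model sees next.
    next_id = out.logits[0, -1].argmax()
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

So when the model is asked to explain its own reasoning, that explanation is just more next-token prediction over the text, not a readout of the computation that produced the text.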

senordevnyc 3 days ago | parent | next

> So the model thinks ahead but cannot reason about its own thinking in a real way. It is rationalizing, not rational.

My understanding is that we can’t either. We essentially make up post-hoc stories to explain our thoughts and decisions.

Zee2 3 days ago | parent | prev

I too have no access to the patterns of my neurons' firing; I can only think and observe as a result of them.