Remix.run Logo
thepasch 10 hours ago

It's not really "trying" to do anything. That they're, inherently, sequential matrix multipliers with clever data propagation should be uncontroversial, but I think stopping there is overly reductive.

Mechanistic interpretability research has found plenty of indicators that real, complex, generalized, and reusable circuits develop in models as they are trained and post-trained, particularly as overtraining ratios increase and memorization shifts to generalization. That's not to say that means they must be "conscious," but the overall point is that claiming anything definitive either way is incomplete.

It can be fascinating reading if you can sort through the chuff.