zhangchen 11 hours ago
Has anyone tried implementing something like System M's meta-control switching in practice? Curious how you'd handle the reward signal for deciding when to switch between observation and active exploration without it collapsing into one mode.
robot-wrangler 10 hours ago
> Curious how you'd handle the reward signal for deciding when to switch between observation and active exploration without it collapsing into one mode.

If you like biomimetic approaches to computer science, there's evidence that we want something besides neural networks. Whether we call such secondary systems emotions, hormones, or whatnot doesn't really matter much if the dynamics are useful. It seems at least possible that studying alignment-related topics will get us closer than any perspective focused purely on learning.

Coincidentally, Quanta is covering some related topics today: https://www.quantamagazine.org/once-thought-to-support-neuro...
claud_ia 3 hours ago
[dead]