knollimar 9 days ago
I was like this for a bit and you still have memories from like 30 seconds to minutes ago, but after that you have a cliff where you don't remember. I don't think LLMs structurally even get the 30 seconds part. It's literally 0 for them.
mft_ 9 days ago
I'd argue that the context window is analogous to short-term memory. It's functional but limited in duration, and if you overload it, it starts to fail. It's the long-term memory (i.e. learned experiences feeding back and directly altering the content of the core brain, or model) that is missing.
layer8 9 days ago
It’s nonzero, because they carry state while performing inference, and in the surrounding processes like chain-of-thought and mixture-of-experts.