knollimar 9 days ago
The context window is so flawed that I wouldn't consider it memory. It feels like notes about the situation rather than something held in memory. Memory has more "attention". I think that "it starts to fail" is load-bearing here.

I feel like memory has about 5 parts, and LLMs are missing 2 of them:

current working memory

short term: what is immediately happening, without it being in "RAM". I differentiate this from working memory, in the Thinking, Fast and Slow sense. Keeping things in working memory is work! You can vibe away short term memory. I had excellent short term memory while I was messed up; I could keep time well. I think LLMs can do this with notes.

mid term: vague awareness of things like what day of the week it is or what you did 2 hours ago. This is where my memory personally failed.

long term memory of experiences. You can fake this with memory.md

generalized wisdom for pattern matching long term memories

LLMs seem to be missing that part I was missing. I'm probably projecting and anthropomorphizing, but I relate: I would confabulate a ton and didn't know anything was wrong for a while, though things seemed off.

Context is like working memory but not short term or mid term. I think you can imply short term with big enough context.

My categories are purely anthropomorphic to me, but I just wanted to say where I disagreed.
mft_ 8 days ago
Thanks for sharing your experience. It's really interesting that you describe a loss of some 'middle' parts but not others. The 'classic' medical/psychological model of memory has three parts (sensory/short-term/long-term), but it's also worth noting that this model was first devised in 1968!

> long term memory of experiences. You can fake this with memory.md

Not sure about this; to my mind, memory.md is analogous to humans making lists of things not to forget, or taking notes from a lecture to learn (i.e. cram into long-term memory) later on. LLMs use it as a shortcut to bring important facts back into their context window, but that's not the same as them already 'knowing' the information via the original training process.

---

My consistent (hot?) take is that a (the?) major piece holding LLMs back (maybe even from AGI?) is continual learning. Humans have systems for continually updating their long-term memory from their lived experience: new facts, processes, skills, successes, mistakes, etc. (Sleep and dreaming are centrally involved in this process.) The current architecture of LLMs makes this practically impossible, as it would presumably require the level of power currently needed for training to be applied continually, which demonstrates the huge efficiency advantage of the biological brain.
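
To make the memory.md point concrete, here is a minimal Python sketch of what "bringing facts back into the context window" roughly amounts to. Everything in it is an assumption for illustration: the bullet-per-note file layout, the function names `parse_memory_file` and `build_prompt`, and the character budget standing in for a token limit. No particular tool necessarily works this way.

```python
def parse_memory_file(text: str) -> list[str]:
    """Treat each '- ' bullet line as one note (hypothetical file layout)."""
    return [line[2:].strip() for line in text.splitlines() if line.startswith("- ")]


def build_prompt(memory_notes: list[str], user_message: str, budget_chars: int = 2000) -> str:
    """Prepend the most recent notes that fit within a character budget."""
    kept: list[str] = []
    used = 0
    # Walk newest-first so recent notes win when the budget runs out.
    for note in reversed(memory_notes):
        cost = len(note) + 1  # +1 for the newline that joins notes
        if used + cost > budget_chars:
            break
        kept.append(note)
        used += cost
    kept.reverse()  # restore chronological order for the prompt
    memory_block = "\n".join(kept)
    return f"# Notes from memory.md\n{memory_block}\n\n# Current message\n{user_message}"


notes = parse_memory_file("- user prefers tabs\n- project uses Python 3.12\n")
prompt = build_prompt(notes, "Refactor utils.py")
```

The notes are re-read as plain text on every call and nothing about the model itself changes, which is why this looks closer to consulting a list than to remembering.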