rahen 3 hours ago

I had the same idea a few months ago: whenever the KV cache nears capacity and the context gets compacted, save the compacted knowledge to a dataset, then use that dataset to fine-tune a LoRA during offline hours.

This would create a three-layer memory system:

- Stable long-term memory (initial base weights)

- Mid-term memory built from the compactions and replay buffers

- Short-term memory (KV cache)

Sleeping would just be a fancy term for consolidating and transferring information from one memory layer to another during offline hours. Maybe that's also what the brain does while sleeping.
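To make the three layers concrete, here is a toy sketch of the data flow only. Everything in it is an assumption for illustration: the class name `ThreeLayerMemory`, the `kv_capacity` and `compact_at` parameters, and the string-joining "summarizer" are hypothetical, and no real model, KV cache, or LoRA training is involved.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ThreeLayerMemory:
    """Toy model of the data flow; the base weights are implied and never change."""

    kv_capacity: int = 8       # hypothetical token budget for short-term memory
    compact_at: float = 0.75   # hypothetical fill ratio that triggers a compaction
    kv_cache: list[str] = field(default_factory=list)       # short-term memory
    replay_buffer: list[str] = field(default_factory=list)  # mid-term: compactions awaiting training
    adapter_log: list[str] = field(default_factory=list)    # stand-in for what a LoRA would absorb

    def observe(self, token: str) -> None:
        """Add a token to short-term memory, compacting when it is nearly full."""
        self.kv_cache.append(token)
        if len(self.kv_cache) >= self.kv_capacity * self.compact_at:
            self.compact()

    def compact(self) -> None:
        """Replace the cache with a marker and keep the summary for later training."""
        # Placeholder summarizer: a real system would ask the model for a summary.
        summary = " ".join(self.kv_cache)
        self.replay_buffer.append(summary)
        self.kv_cache = [f"<summary:{len(self.replay_buffer)}>"]

    def sleep(self) -> int:
        """Offline consolidation: move queued compactions into the adapter stand-in."""
        moved = len(self.replay_buffer)
        # A real system would run a LoRA fine-tune on the replay buffer here.
        self.adapter_log.extend(self.replay_buffer)
        self.replay_buffer.clear()
        return moved


mem = ThreeLayerMemory()
for tok in "a b c d e f g h i j k l".split():
    mem.observe(tok)
consolidated = mem.sleep()
```

The only point of the sketch is the direction of travel: tokens enter the cache, compactions land in the replay buffer, and `sleep()` drains that buffer into the slower layer.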

chermi 3 hours ago | parent | next [-]

Wouldn't that just accelerate model collapse? How much do you trust the LLM's outputs to provide trustworthy and valuable new information? I understand that distillation works, but that's much more structured and thoughtful than my sessions, at least.

jack_pp 3 hours ago | parent | next [-]

We can trust the feedback we give it based on the output it provides.

ambicapter 3 hours ago | parent [-]

What kind of feedback are you giving? What's the reward function?

jack_pp an hour ago | parent [-]

Right now, no feedback, since I don't run this system, but our workflows could change to accommodate it.

DonHopkins 3 hours ago | parent | prev [-]

It's a network of computers with GPUs, so there's no reason it can't sleep at the same time it's awake. Just a continuous "sleeping" process going on in the background, incrementally updating the model. No need for the "thinking" process to be "unconscious" while the "sleeping" process runs. Anthropomorphism confuses everything. There's no such thing as "offline hours" because the Earth is a sphere and the United States is not the center of the universe.
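A rough sketch of that arrangement, using a worker thread as a stand-in for the background "sleeping" process. The names (`BackgroundConsolidator`, `submit`, `snapshot`, `drain`) are made up for illustration, and the "weights" are just a list; a real system would apply an incremental model update where the comment says so.

```python
import queue
import threading


class BackgroundConsolidator:
    """Toy stand-in: 'weights' is a list that a worker thread appends to."""

    def __init__(self) -> None:
        self.pending: queue.Queue = queue.Queue()
        self.weights: list = []
        self._lock = threading.Lock()
        # Daemon thread so it never blocks interpreter exit.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, summary: str) -> None:
        """Called by the 'awake' side; returns immediately."""
        self.pending.put(summary)

    def _run(self) -> None:
        """The 'sleeping' side: consume summaries one at a time, forever."""
        while True:
            summary = self.pending.get()
            with self._lock:
                # A real system would apply an incremental update here.
                self.weights.append(summary)
            self.pending.task_done()

    def snapshot(self) -> list:
        """Read the current state without stopping the worker."""
        with self._lock:
            return list(self.weights)

    def drain(self) -> None:
        """Block until everything submitted so far has been consolidated."""
        self.pending.join()


consolidator = BackgroundConsolidator()
consolidator.submit("session 1 compaction")
consolidator.submit("session 2 compaction")
consolidator.drain()
state = consolidator.snapshot()
```

The foreground never waits on the worker unless it calls `drain()`, which is only there so the example finishes in a known state.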