gwd 5 days ago

> I think the answer is pretty obvious given that LLM's can't learn at runtime - can't try out some reasoning generalization they may have arrived at, find that it doesn't work in a specific case, then explore the problem and figure it out for next time.

This is just a problem of memory. Supposing that an LLM did generate a genuinely novel insight, it could in theory write a note for itself so that the next time it comes online, it can read through a summary of the things it learned. It could also write synthetic training data for itself so that the next time it's trained, that insight gets incorporated into its general knowledge.
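A minimal sketch of that note-taking idea (the file path and prompt wording are made up for illustration): insights flagged during a session are appended to a plain-text file, which is prepended to the system prompt the next time the model comes online.

```python
import os

NOTES_PATH = "model_notes.txt"  # hypothetical persistent store for self-written notes


def save_insight(insight: str) -> None:
    """Append an insight the model flagged during a session."""
    with open(NOTES_PATH, "a") as f:
        f.write(insight.strip() + "\n")


def build_system_prompt(base_prompt: str) -> str:
    """Prepend previously saved notes so the next session starts with them in context."""
    if os.path.exists(NOTES_PATH):
        with open(NOTES_PATH) as f:
            notes = f.read().strip()
        if notes:
            return base_prompt + "\n\nNotes from previous sessions:\n" + notes
    return base_prompt
```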

OpenAI allows you to fine-tune GPT models, I believe. You could imagine a GPT system working for 8 hours in a day, then spending a bunch of time looking over all of its conversations for patterns, insights, or things to learn, and modifying its own fine-tuning data (adding, removing, or modifying as appropriate), which it then uses to train itself overnight, waking up the next morning having synthesized the previous day's experience.
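A rough sketch of that overnight loop, assuming the openai Python SDK; the model names, file names, and extraction prompt are placeholders, not a tested pipeline: the day's transcripts are distilled into chat-format training examples, written to JSONL, uploaded, and submitted as a fine-tuning job.

```python
import json
from openai import OpenAI  # assumes the openai Python SDK (v1.x)

client = OpenAI()


def distill_day(transcripts: list[str]) -> list[dict]:
    """Ask the model to turn each conversation into a training example (illustrative prompt)."""
    examples = []
    for t in transcripts:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": "Extract one lesson from this conversation as a "
                 "user/assistant exchange worth training on. Reply as JSON with 'user' and "
                 "'assistant' keys."},
                {"role": "user", "content": t},
            ],
        )
        lesson = json.loads(resp.choices[0].message.content)  # sketch: assumes the reply is valid JSON
        examples.append({"messages": [
            {"role": "user", "content": lesson["user"]},
            {"role": "assistant", "content": lesson["assistant"]},
        ]})
    return examples


def train_overnight(examples: list[dict]) -> None:
    """Write the examples to a JSONL file, upload it, and kick off a fine-tuning job."""
    with open("day_synthesis.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")
    uploaded = client.files.create(file=open("day_synthesis.jsonl", "rb"), purpose="fine-tune")
    client.fine_tuning.jobs.create(training_file=uploaded.id, model="gpt-4o-mini")  # placeholder base model
```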

HarHarVeryFunny 5 days ago | parent

> This is just a problem of memory

How does memory (maybe later incorporated via fine-tuning) help if you can't figure out how to do something in the first place?

That would be a way to incorporate new declarative data at "runtime" - feedback to the AI intern about what it is doing wrong. However, doing something effectively by yourself generally requires more than just new knowledge - it requires personal practice, experimentation, etc., since you need to learn how to act based on the contents of your own mind, not that of the instructor.

Even when you've had enough practice to become proficient at a taught skill, you may not be able to verbalize exactly what you are doing (which is part of the teacher-student gap), so attempting to describe and then capture that as textual/context "sensory input" is not always going to work.