dlivingston 7 days ago

Reminds me of how Google's Genie 3 can only run for a ~minute before losing its internal state [0].

My gut feeling is that this problem won't be solved until some new architecture is invented, on the scale of the transformer, which allows for short-term context, long-term context, and self-modulation of model weights (to mimic "learning"). (Disclaimer: hobbyist with no formal training in machine learning.)

[0]: https://news.ycombinator.com/item?id=44798166

skydhash 7 days ago

It’s the nature of formal systems. Someone needs to actually do the work of defining those rules, or of finding a smaller set of rules that can generate the larger set. But any time you invent a rule, some things that would otherwise be possible can no longer be represented in the system. You’re mostly hoping that those things aren’t meaningful.
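
To make that concrete, here’s a toy sketch of my own (the grammar S -> "a" S "b" | "ab" is hypothetical, just the smallest example I could think of): two rules generate an infinite set of strings, and everything outside that set is simply unrepresentable, by construction.

    # A toy formal system with two rules: S -> "a" S "b" and S -> "ab".
    # It generates exactly the strings a^n b^n. Strings like "aab" or
    # "ba" are perfectly possible as strings, but the rules exclude
    # them by construction -- the system can never represent them.
    def derivable(s: str) -> bool:
        if s == "ab":
            return True
        if len(s) > 2 and s.startswith("a") and s.endswith("b"):
            return derivable(s[1:-1])
        return False

    print(derivable("aaabbb"))  # True:  inside the system
    print(derivable("aab"))     # False: outside it, and invisible to it
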

LLM techniques allow us to extract rules from text and other data. But that data is not representative of a coherent system, so the result is itself incoherent and lacks anything that wasn’t part of the data. And that’s to be expected.

It’s the same as having a mathematical function. Every point it maps is meaningful; everything else may as well not exist.
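
In code terms (a rough sketch of the analogy, nothing more, with made-up values), think of the function as a finite lookup table: the points it maps are the only ones it “knows about”.

    # A function given as a finite map. The points it maps are the only
    # ones that exist for it; asking about anything else isn't an error
    # in the math so much as a question the system has no answer for.
    f = {0: 1.0, 1: 2.7, 2: 7.4}  # hypothetical values, purely illustrative

    print(f[1])       # 2.7  -- a mapped point, meaningful
    print(f.get(5))   # None -- never in the domain, no answer at all
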