▲ | PaulHoule 6 days ago | |||||||
If you want general-purpose generation than it has to be able to respect constraints (e.g. figure art of a person has 0..1 belly buttons, 0..2 legs is unspoken) as it is generative models usually get those things right but don't always if they can stick together the tiles they use internally in some combination that makes sense locally but not globally. General intelligence may not be SAT/SMT solving but it has to be able to do it, hence, backtracking. Today I had another of those experiences of the weaknesses of LLM reasoning, one that happens a lot when doing LLM-assisted coding. I was trying to figure out how to rebuild some CSS after the HTML changed for accessibility purposes and got a good idea for how to do it from talking to the LLM but at that point the context was poisoned, probably because there was a lot of content about the context describing what we were thinking about at different stages of the conversation which evolved considerably. It lost its ability to follow instructions and I'd tell it specifically to do this or do that and it just wouldn't do it properly and this happens a lot if a session goes on too long. My guess is that the attention mechanism is locking on to parts of the conversation which are no longer relevant to where I think we're at and in general the logic that considers the variation of either a practice (instances) or a theory over time is a very tricky problem and 'backtracking' is a specific answer for maintaining your knowledge base across a search process. | ||||||||
▲ | photonthug 6 days ago | parent | next [-] | |||||||
> General intelligence may not be SAT/SMT solving but it has to be able to do it, hence, backtracking. Just to add some more color to this. For problems that completely reduce to formal methods or have significant subcomponents that involve it, combinatorial explosion in state-space is a notorious problem and N variables is going to stick you with 2^N at least. It really doesn't matter whether you think you're directly looking at solving SAT/search, because it's too basic to really be avoided in general. When people talk optimistically about hallucinations not being a problem, they generally mean something like "not a problem in the final step" because they hope they can evaluate/validate something there, but what about errors somewhere in the large middle? So even with a very tiny chance of hallucinations in general, we're talking about an exponential number of opportunities in implicit state-transitions to trigger those low-probability errors. The answer to stuff like this is supposed to be "get LLMs to call out to SAT solvers". Fine, definitely moving from state-space to program-space is helpful, but it also kinda just pushes the problem around as long as the unconstrained code generation is still prone to hallucination.. what happens when it validates, runs, and answers.. but the spec was wrong? Personally I'm most excited about projects like AlphaEvolve that seem fearless about hybrid symbolics / LLMs and embracing the good parts of GOFAI that LLMs can make tractable for the first time. Instead of the "reasoning is dead, long live messy incomprehensible vibes", those guys are talking about how to leverage earlier work, including things like genetic algorithms and things like knowledge-bases.[0] Especially with genuinely new knowledge-discovery from systems like this, I really don't get all the people who are still staunchly in either an old-school / new-school camp on this kind of thing. [0]: MLST on the subject: https://www.youtube.com/watch?v=vC9nAosXrJw | ||||||||
| ||||||||
▲ | XenophileJKO 6 days ago | parent | prev [-] | |||||||
What if you gave the model a tool to "willfully forget" a section of context. That would be easy to make. Hmm I might be onto something. | ||||||||
|