Remix.run Logo
photonthug 6 days ago

> General intelligence may not be SAT/SMT solving but it has to be able to do it, hence, backtracking.

Just to add some more color to this. For problems that completely reduce to formal methods or have significant subcomponents that involve it, combinatorial explosion in state-space is a notorious problem and N variables is going to stick you with 2^N at least. It really doesn't matter whether you think you're directly looking at solving SAT/search, because it's too basic to really be avoided in general.

When people talk optimistically about hallucinations not being a problem, they generally mean something like "not a problem in the final step" because they hope they can evaluate/validate something there, but what about errors somewhere in the large middle? So even with a very tiny chance of hallucinations in general, we're talking about an exponential number of opportunities in implicit state-transitions to trigger those low-probability errors.

The answer to stuff like this is supposed to be "get LLMs to call out to SAT solvers". Fine, definitely moving from state-space to program-space is helpful, but it also kinda just pushes the problem around as long as the unconstrained code generation is still prone to hallucination.. what happens when it validates, runs, and answers.. but the spec was wrong?

Personally I'm most excited about projects like AlphaEvolve that seem fearless about hybrid symbolics / LLMs and embracing the good parts of GOFAI that LLMs can make tractable for the first time. Instead of the "reasoning is dead, long live messy incomprehensible vibes", those guys are talking about how to leverage earlier work, including things like genetic algorithms and things like knowledge-bases.[0] Especially with genuinely new knowledge-discovery from systems like this, I really don't get all the people who are still staunchly in either an old-school / new-school camp on this kind of thing.

[0]: MLST on the subject: https://www.youtube.com/watch?v=vC9nAosXrJw

PaulHoule 5 days ago | parent [-]

When I was interested in information extraction I saw the problem of resolving language to a semantic model [1] as containing an SMT problem. That is, words are ambiguous, sentences can parse different ways, you have to resolve pronouns and explicit subjects, objects and stuff like that.

Seen that way the text is a set of constraints with a set of variables for all the various choices you make determining it. And of course there is a theory of the world such that "causes must precede their effects" and all the world knowledge about instances such as "Chicago is in Illinois".

The problem is really worse than that because you'll have to parse sentences that weren't generated by sound reasoners or that live in a different microtheory, deal with situations that are ambiguous anyway, etc. Which is why that program never succeeded.

[1] in short: database rows