LoganDark 8 hours ago
What you're describing is not finding flaws in code. It's summarizing, which current models are known to be relatively good at.

It is true that models can happen to produce a sound reasoning process. This is probabilistic, however (moreso than humans, anyway). There is no known sampling method that can guarantee a deterministic result without significantly quashing the output space (excluding most correct solutions).

I believe we'll see a different landscape of benefits and drawbacks as diffusion language models begin to emerge, and as even more architectures are invented and practiced. I have a tentative belief that diffusion language models may be easier to make deterministic without quashing nearly as much expressivity.
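A minimal sketch of the sampling trade-off described above, under loud assumptions: `TOY_LOGITS` is a made-up four-token distribution standing in for a model's next-token scores, and every function name here is hypothetical. It only illustrates that greedy (argmax) decoding is deterministic because it collapses the reachable outputs to one, while temperature sampling reaches more outputs at the cost of run-to-run variation. It says nothing about any real model or library.

```python
import math
import random

# Hypothetical next-token scores for a 4-token vocabulary (not from a real model).
TOY_LOGITS = [2.0, 1.8, 0.5, -1.0]

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits):
    """Deterministic decoding: always return the index of the highest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_pick(logits, temperature, rng):
    """Stochastic decoding: draw one index in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def distinct_outputs(pick, runs=200):
    """Collect the set of token indices a decoding strategy produces over many runs."""
    return {pick() for _ in range(runs)}

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the demo itself is repeatable
    print("greedy :", distinct_outputs(lambda: greedy_pick(TOY_LOGITS)))
    print("sampled:", distinct_outputs(lambda: sample_pick(TOY_LOGITS, 1.0, rng)))
```

In this toy setup the near-tied second token (logit 1.8 against 2.0) is never emitted by greedy decoding, which is the "quashing" in miniature. Seeding the generator makes a sampled run repeatable, but that fixes one trajectory rather than guaranteeing a particular result.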
nielsole 8 hours ago
> moreso than humans

Citation needed.
MrOrelliOReilly 7 hours ago
This all sounds like the stochastic parrot fallacy. Total determinism is not the goal, and it is not a prerequisite for general intelligence. As you allude to above, humans are also not fully deterministic. I don't see what hard theoretical barriers you've presented toward AGI or future ASI.
michaelscott 7 hours ago
Nothing you've said about reasoning here is exclusive to LLMs. Human reasoning is also never guaranteed to be deterministic, excluding most correct solutions. As OP says, they may not be reasoning under the hood, but if the effect is the same as a tool, does it matter?

I'm not sure if I'm up to date on the latest diffusion work, but I'm genuinely curious how you see them potentially making LLMs more deterministic? These models usually work by sampling too, and it seems like the transformer architecture is better suited to longer-context problems than diffusion.