simianwords 5 days ago

After thinking a long while about why I find LLMs useful despite their high error rate: it is because my ability to verify a given result (my internal verifier model) is accurate enough, and the generator model, the LLM, is also accurate enough. This is the same concept as red team and blue team.
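
To make the red team/blue team framing concrete, here is a minimal Python sketch of the generate-and-verify loop I mean; generate() and verify() are hypothetical stand-ins I made up, not any real API:

    import random

    def generate(prompt):
        # Stand-in for an LLM: returns the right answer only 60% of the time.
        return "right" if random.random() < 0.6 else "wrong"

    def verify(answer):
        # Stand-in for my internal verifier model; assumed exact here.
        return answer == "right"

    def ask(prompt, max_tries=5):
        # Blue team generates, red team checks: a 40% generator error
        # rate is tolerable because a rejected answer only costs a retry.
        for _ in range(max_tries):
            answer = generate(prompt)
            if verify(answer):
                return answer
        return None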

It's the same reason I find it useful to ask many people for their opinions: I take every answer, try to fit it into my world model, and see what sticks. The point many miss is that each individual's verifier model is accurate enough that external generator models can afford to have high error rates.

I have not yet fully explored how this internal "fitting" mechanism works, but to give an example: I read many anecdotes on Reddit knowing full well that many are astroturfed and some are flat-out wrong. I still have tricks, probably applied subconsciously, for identifying which are likely to be accurate.

In reality, answers don't exist in a uniformly random space. "Truth" always has some structure, and it is this structure (of which each of us understands a small part) that lets us tune our verifier model.

It is useful to think about how this plays out at varying levels of generator accuracy: imagine a spectrum running from gibberish, to a model like o3, to ground truth. Gibberish is so inaccurate that even an extremely accurate internal verifier model cannot make it useful. But o3 is accurate enough that, combined with my internal verifier model, it is generally useful.
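
A back-of-the-envelope way to see why both accuracies matter (my own toy model, nothing rigorous): if the generator is right with probability g, and my verifier accepts correct answers at rate tpr and wrong ones at rate fpr, then by Bayes' rule the chance an accepted answer is actually correct is:

    def accepted_correct(g, tpr, fpr):
        # P(correct | accepted), by Bayes' rule.
        return (g * tpr) / (g * tpr + (1 - g) * fpr)

    # Gibberish generator: even a sharp verifier can't rescue it.
    print(accepted_correct(g=0.001, tpr=0.95, fpr=0.05))  # ~0.02

    # Strong model: the same verifier now makes accepted answers trustworthy.
    print(accepted_correct(g=0.8, tpr=0.95, fpr=0.05))    # ~0.99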

davidhs 5 days ago | parent

LLMs can be useful when you have access to a verifier or verification process.

simianwords 5 days ago | parent

Yes: https://deepmind.google/discover/blog/alphaevolve-a-gemini-p...

Our internal verifier model is fuzzy, but in this example I think the verifier is pretty much always accurate.
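
A minimal sketch of that AlphaEvolve-style loop, where the verifier is not fuzzy at all because candidates are scored by actually running them; mutate() and score() are hypothetical placeholders, not DeepMind's real interfaces:

    def evolve(seed, mutate, score, generations=100):
        # Propose-evaluate-select: the LLM (mutate) may be wrong most
        # of the time, because score() verifies each candidate exactly.
        best, best_score = seed, score(seed)
        for _ in range(generations):
            candidate = mutate(best)   # noisy LLM proposal
            s = score(candidate)       # exact, automated verification
            if s > best_score:
                best, best_score = candidate, s
        return best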