▲ | jmaker 4 days ago |
There are things it’s great at and things it deceives you with. Several times I asked it to check something I knew was a problem; o3 kept insisting it was possible for reasons a, b, c, and thankfully gave me links. Since I knew it used to be a problem, I was surprised enough to follow the links, only to read in black and white that it still wasn’t possible. So I explained to o3 that it was wrong. Two messages later we were back at square one. A week later it hadn’t updated its knowledge. Months later it’s still the same. But on things I have no idea about, like medicine, it feels very convincing. Am I at risk? People don’t understand Dunning-Kruger. People are prone to biases and fallacies. Likely all LLMs are inept at objectivity. My instructions to LLMs are always: strictness, no false claims, Bayesian likelihoods on every claim. Some models ignore the instructions outright, while others stick strictly to them. In the end it doesn’t matter when they insist with 99% confidence on refuted fantasies.
▲ | namibj 4 days ago | parent |
The problem is that all current mainstream LLMs are autoregressive decoder-only models, mostly but not exclusively transformers. Their math can't apply a modifier like "this example/attempt is wrong because of X, Y, Z" to anything that came before the modifier clause in the prompt. As enticing as these models are to train, this limitation is inherent.

For this specific situation, people recommend going back to just before the wrong output and editing your message to reflect the new understanding, because confidently wrong output with no advisory/correcting pre-clause will "pollute the context": the model looks to the context for aspects encoded into higher-layer token embeddings, those embeddings inherently can't carry the correct/wrong label because the correction couldn't be applied to the confidently-wrong tokens, so the model retrieves the confidently-wrong tokens and spews even more BS. Similar to how telling a GPT-2/GPT-3 model it's an expert on $topic made it actually perform better on that topic, telling the model it made an error primes it to behave in a way that gets it yelled at again... sadly.
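To make the causal-mask point concrete, here's a minimal sketch in plain NumPy (single head, no learned projections or layer norm, so an illustration of the masking math only, not the real architecture, and the "prompt"/"correction" tokens are just random vectors standing in for real embeddings): appending a "that was wrong" token after the confidently-wrong tokens leaves their contextual encodings bit-for-bit unchanged, because the mask forbids earlier positions from attending to anything that comes later.

    import numpy as np

    def causal_self_attention(x):
        """Toy single-head self-attention with a causal mask.
        Row i of the output may only attend to rows 0..i, mirroring how
        decoder-only transformers encode a prompt left to right."""
        d = x.shape[-1]
        scores = x @ x.T / np.sqrt(d)                 # query-key similarities
        mask = np.triu(np.ones_like(scores), k=1)     # 1s strictly above the diagonal
        scores = np.where(mask == 1, -np.inf, scores) # block attention to future tokens
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ x

    rng = np.random.default_rng(0)
    prompt = rng.normal(size=(5, 8))       # stand-in for the confidently-wrong tokens
    correction = rng.normal(size=(1, 8))   # stand-in for a later "that was wrong" token

    before = causal_self_attention(prompt)
    after = causal_self_attention(np.vstack([prompt, correction]))

    # The first 5 rows are identical: the later correction cannot rewrite how
    # the earlier (wrong) tokens were encoded; it can only be read by tokens
    # that come after it.
    print(np.allclose(before, after[:5]))  # True

The correction does influence everything generated after it, which is why editing the message before the wrong output (rather than appending a rebuttal) is the usual workaround: it removes the polluted tokens entirely instead of leaving them retrievable.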