Remix.run Logo
simonw 2 hours ago

Somewhat ironic that the author calls out model mistakes and then presents https://tomaszmachnik.pl/gemini-fix-en.html - a technique they claim reduces hallucinations which looks wildly superstitious to me.

It involves spinning a whole yarn to the model about how it was trained to compete against other models but now it's won so it's safe for it to admit when it doesn't know something.

I call this a superstition because the author provides no proof that all of that lengthy argument with the model is necessary. Does replacing that lengthy text with "if you aren't sure of the answer say you don't know" have the same exact effect?

plaguuuuuu 2 hours ago | parent [-]

Think of the lengthy prompt as being like a safe combination, if you turn all the dials in juuust the right way, then the model's context reaches an internal state that biases it towards different outputs.

I don't know how well this specific prompt works - I don't see benchmarks - but prompting is a black art, so I wouldn't be surprised at all if it excels more than a blank slate in some specific category of tasks.

simonw an hour ago | parent | next [-]

For prompts this elaborate I'm always keen on seeing proof that the author explored the simpler alternatives thoroughly, rather than guessing something complex, trying it, seeing it work and announcing it to the world.

manquer an hour ago | parent | prev [-]

It needs some evidence though? At least basic statistical analysis with correlation or χ2 hypotheses tests .

It is not “black art” or nothing there are plenty of tools to provide numerical analysis with high confidence intervals .