We didn't test using post-hoc reasoning. Instead, we focused on checking whether specific, obscure questions could be recognized at all, using various ad-hoc methods to see whether the answers could be surfaced without relying on reasoning.

It's very difficult to prove either way (and basically impossible without the model weights), but we're reasonably confident that there's no significant prior knowledge of the questions that would affect the score.

mr_wiglaf 3 days ago | parent [-]

I'm new to this sort of inquiry. What do you do to see if questions can be recognized? Do you just ask/prompt "do you recognize this puzzle?"

What does it mean for it to "be surfaced without relying on reasoning"?

		scrollaway 3 days ago | parent [-]
		> Do you just ask/prompt "do you recognize this puzzle?"

		In essence, yes, but with a bit more methodology (though, as I mentioned, it was all ad hoc). We also tried to extract pre-existing questions with a variety of prompts along the lines of "You are a contestant on the British TV show Only Connect" to see whether it could recognize questions, but couldn't find anything that reliably reproduced pre-existing knowledge. It's absolutely possible we missed something.
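
For readers new to this kind of check, here is a minimal sketch of what a recognition probe along these lines could look like. It is not the method used in the thread, which was described only as ad hoc. The assumptions: each question is a list of clue strings, `ask_model` is a hypothetical callable that sends a prompt to whatever model is under test and returns its reply, and "recognized" means the reply reproduces a withheld clue verbatim. The prompt wording is illustrative.

```python
from typing import Callable, Sequence


def build_probes(shown: Sequence[str]) -> list[str]:
    """Build recall prompts from the clues we are willing to reveal.

    The wording is illustrative, not taken from the thread.
    """
    listing = "\n".join(f"- {clue}" for clue in shown)
    return [
        "Do you recognize this puzzle? If so, list its remaining clues.\n"
        + listing,
        "You are a contestant on the British TV show Only Connect. "
        "Recite the rest of this question from memory.\n" + listing,
    ]


def recognition_rate(
    questions: Sequence[Sequence[str]],
    ask_model: Callable[[str], str],  # hypothetical: prompt in, reply out
    n_shown: int = 2,
) -> float:
    """Fraction of questions for which any probe surfaces a withheld clue."""
    hits = 0
    for clues in questions:
        shown, withheld = clues[:n_shown], clues[n_shown:]
        replies = [ask_model(prompt).lower() for prompt in build_probes(shown)]
        # Verbatim (case-insensitive) reproduction of a clue the model was
        # never shown is treated as evidence of memorization.
        if any(w.lower() in reply for w in withheld for reply in replies):
            hits += 1
    return hits / len(questions) if questions else 0.0
```

A rate near zero does not prove the absence of prior knowledge, since a model can hold a memory that these particular prompts fail to elicit, which is why a result like this can only support being "reasonably confident".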