Better source validation is one of the main reasons I'm excited about GPT-5 Thinking for this. It would be interesting to try your Gemini prompts against that and see how the results compare.

▲

Hugsun 4 days ago | parent [-]

I've found GPT-5 Thinking to perform worse than o3 did on tasks of a similar nature. It makes more bad assumptions that derail the train of thought.

	▲	3abiton 3 days ago | parent [-]
		I think the key is prompting, and bounding the model's assumptions.