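A (costly) way is to send the same prompt to several different models and compare their responses. Models don't hallucinate in exactly the same way, so an answer they agree on is less likely to be made up, while a disagreement flags the answer for closer checking.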
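As a rough illustration, the comparison can be sketched as a majority vote over normalized answers. Everything here is an assumption made for the example: the `cross_check` helper, the `min_agreement` threshold, and the stub "models" (plain functions returning canned strings) are hypothetical and stand in for real API calls. Exact string matching is also only reasonable for short factual answers; longer free-form responses would need some fuzzier comparison.

```python
from collections import Counter
from typing import Callable, Dict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def cross_check(
    prompt: str,
    models: Dict[str, Callable[[str], str]],
    min_agreement: float = 0.5,  # hypothetical threshold, tune for your use case
) -> dict:
    """Ask every model the same prompt and report how much they agree."""
    if not models:
        raise ValueError("at least one model is required")

    # One call per model: this is where the cost comes from.
    answers = {name: normalize(ask(prompt)) for name, ask in models.items()}

    # The most common answer is treated as the consensus.
    top_answer, votes = Counter(answers.values()).most_common(1)[0]
    agreement = votes / len(answers)

    return {
        "answer": top_answer,
        "agreement": agreement,
        # Models whose answer differs from the consensus.
        "dissenting": sorted(n for n, a in answers.items() if a != top_answer),
        # Only trust the consensus if a strict majority backs it.
        "trusted": agreement > min_agreement,
    }


# Stub models for demonstration only; replace with real API calls.
stub_models = {
    "model_a": lambda prompt: "Canberra",
    "model_b": lambda prompt: "  canberra ",
    "model_c": lambda prompt: "Sydney",
}

result = cross_check("What is the capital of Australia?", stub_models)
print(result)
```

Agreement is not proof of correctness, since models trained on similar data can share the same mistake, so a low agreement score is best read as "check this answer" rather than a high score as "this is true".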