I also don’t know for certain, but I’d assume they cache AI responses at most at a regional level, and only for a fairly short window whose length depends on the kind of site. They already had mechanisms for detecting changes and updating their global search index quickly; the AI features most likely piggyback on that existing system.
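To make concrete what I’m picturing, here’s a toy sketch of that kind of regional, short-TTL cache. This is pure speculation: the class, keys, and TTL numbers are all my invention, not anything Google has documented.

```python
import time

# All names, keys, and TTL values below are invented for illustration;
# nothing here reflects Google's actual internals.
TTL_BY_SITE_KIND = {
    "news": 15 * 60,         # fast-moving sites: cache for minutes
    "reference": 24 * 3600,  # slow-moving sites: cache for up to a day
}

class RegionalAICache:
    """Toy per-region cache for generated answers with site-dependent TTLs."""

    def __init__(self):
        # (region, query) -> (expiry timestamp, cached AI response)
        self._store = {}

    def get(self, region: str, query: str):
        entry = self._store.get((region, query))
        if entry and entry[0] > time.time():
            return entry[1]
        # Expired or missing: the caller re-runs the model, so any
        # model-level flaw reappears on every regeneration.
        return None

    def put(self, region: str, query: str, response: str, site_kind: str):
        ttl = TTL_BY_SITE_KIND.get(site_kind, 3600)
        self._store[(region, query)] = (time.time() + ttl, response)

    def invalidate(self, region: str, query: str):
        # Imagined hook into the existing change-detection / re-indexing pipeline.
        self._store.pop((region, query), None)
```

The point of the sketch is the miss path: once an entry expires or gets invalidated, the model runs again from scratch.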
This seems more like a model-specific issue, where it consistently generates flawed output every time the cache is invalidated and the answer is regenerated. If that’s the case, there’s not much Google can do on a case-by-case basis, but we should see improvements over time as the model gets incrementally better / it becomes more financially viable to run better models at this scale.