I strongly suspect it's the latter, though someone please chime in if I'm wrong.
Even so, this is a real advancement. It's impressive to see existing techniques combined to meaningfully improve on SOTA image generation.