My point wasn't the delta in work between video and text generation. It was that the degradation of a prompt is much more visible (because: literal). But, generally agree on the research/dev part.