kaibee 11 hours ago
So? Any video generation model must necessarily be able to do this. (Consider generating a pan over a chess board where the starting input frame shows only the first pawn and rook: the model should know to generate the rest of the pieces in the style of the input pieces.)