IshKebab 4 days ago

Funny how it really wants people to dance. Even the guy sitting down for an interview just starts dancing sitting down.

jonas21 4 days ago | parent | next [-]

Presumably they're dancing because it's in the prompt. You could change the prompt to have them do something else (but that would be less fun!)

IshKebab 4 days ago | parent [-]

I'm no expert, but are you sure there is a prompt?

dragonwriter 4 days ago | parent [-]

Yes, while the page here does not directly mention the prompts, the linked paper does, and the linked code repo shows that prompts are used as well.

vunderba 3 days ago | parent | next [-]

100%. I don't think I've ever come across an I2V model that didn't require at least a positive prompt. Some people get around it, however, by integrating a vision LLM into their ComfyUI workflows.

IshKebab 3 days ago | parent | prev [-]

Ah yeah you're right - they seem to just really like giving dancing prompts. I guess they work well due to the training set.

Jaxkr 4 days ago | parent | prev | next [-]

There's a massive open TikTok training set that lots of video researchers use.

bravura 4 days ago | parent | prev [-]

It's a peculiar and fascinating observation you make.

With static images, we always look for eyes.

With video, we always look for dancing.