Show HN: DeepDream for Video with Temporal Consistency

I forked a PyTorch DeepDream implementation and added video support with temporal consistency. It produces smooth DeepDream videos with minimal flickering, and is highly flexible including many parameters and supports multiple pretrained image classifiers including GoogLeNet. Check out the repo for sample videos! Features:

- Optical flow warps previous hallucinations into the current frame

- Occlusion masking prevents ghosting and hallucination transfer when objects move

- Advanced parameters (layers, octaves, iterations) still work

- Works on GPU, CPU, and Apple Silicon

▲

reactordev a day ago | parent | next [-]

Reminds me of my first acid trip.

▲

noobcoder a day ago | parent | prev | next [-]

I remember back in 2018 we used do FFmpeg split clips into frames, hit each with GoogLeNet gradient ascent on layers thenn blended prev frame for crude smoothing

	▲	embedding-shape a day ago \| parent [-]
		SOTA for frame interpretation today is probably RIFE (https://github.com/hzwer/ECCV2022-RIFE) as far as I know, which is fast as hell as well, and really good results still. But it's already 4 years old now, anyone know if there is anything better than RIFE for this sort of stuff today?

▲

DustinBrett a day ago | parent | prev | next [-]

Looking at that video makes me sick.

▲

kieojk a day ago | parent | prev | next [-]

As the name of the model suggests, the generated videos are full of dreams.

▲

dudefeliciano a day ago | parent | prev | next [-]

looks cool! i see the classic dog faces when generating video, is it possible to use own images for the style of the output video?

▲

echelon a day ago | parent | prev [-]

This is a trip down memory lane!

I remember when DeepDream first came out, and WaveNet not long after. I was immediately convinced this stuff was going to play a huge role in media production.

I'm a big hobbyist filmmaker. I told all of my friends who actually work in film (IATSE, SAG, etc.) and they were so skeptical. I tried to get them to make an experimental film using DeepDream.

This was about the same time Intel was dabbling in 360 degree filmmaking and just prior to Epic Games / Disney working on "The Volume".

I bought a bunch of Kinects and built a really low-fidelity real time version of what Intel was working on. The sensors are VGA resolution, so it's not at all cinematic.

When Stable Diffusion came out, I hooked up Blender to image-to-image and fed it frames of previz animations to convert to style transferred anime. Then IP Adapter. Then Animate Diff.

My friends were angry with me at this point. AI was the devil. But I kept at it.

I built an animation system for the web three years ago. Nonlinear timeline editing, camera controls, object transformations. It was really crude and a lot of work to produce simple outputs: https://storyteller.ai/

It was ridiculously hard to use. I typically film live action for the 48 Hour Film Project (twice annual film "hackathon" that I've done since I was a teenager). I used Mocap suits and 3D animation, and this is the result of 48 hours of no sleep:

https://vimeo.com/955680517/05d9fb0c4f

We won two awards for this. The audience booed us.

The image-to-video models came out right after this and immediately sunk this approach. Luma Dream Machine was so easy and looked so much better. Starting frames are just like a director and DP blocking out a scene and then calling action - it solved for the half of the problem I had ignored, which was precisely controlling for look/feel (though this abandons temporal control).

There was a lot of slop, but I admired the work some hard-working people were creating. Those "movie trailers" people were criticizing were easily 10 hours of work with the difficulty of the tech back then.

I found use in model aggregation services like OpenArt and FreePik. ComfyUI is too far removed for me - I appreciate people who can do node magic, but it's not my thing.

I've been working on ArtCraft ( https://github.com/storytold/artcraft ), which is a more artist-centered version for blocking out and precisely articulating scenes.

My friends and I have been making a lot of AI films, and it's almost replaced our photons-on-glass filmmaking output. (We've done some rotoscoped AI + live action work.)

https://www.youtube.com/watch?v=Tii9uF0nAx4 (live action rotoscoped film)

https://www.youtube.com/watch?v=v_2We_QQfPg (EbSynth sketch about The Predator)

https://www.youtube.com/watch?v=tAAiiKteM-U (Robot Chicken inspired Superman parody)

https://www.youtube.com/watch?v=oqoCWdOwr2U (JoJo grinch parody)

We're going to do a feature length film at some point, but we're still building up the toolbox.

If you're skeptical about artists using AI, you should check out Corridor Crew. They're well respected in our field, they have been for over a decade, and they love AI:

https://en.wikipedia.org/wiki/Corridor_Digital

https://www.youtube.com/watch?v=DSRrSO7QhXY

https://www.youtube.com/watch?v=GVT3WUa-48Y

https://www.youtube.com/watch?v=iq5JaG53dho

They're big ComfyUI fans. I just can't get used to it.

Real filmmakers and artists are using this tech now. If you hate AI, please know that we see this more as an exoskeleton than as a replacement. It enables us to reach the look and feel of a $100+ million dollar Pixar, Star Wars, or Marvel film without the budgets we could never have without insane luck or nepotism.

If anything, this elevates us to a place where we will one day be competing with Disney. They should fear us instead of the other way around.

▲

chabes 21 hours ago | parent | next [-]

The artist Panda Bear has a music video from this time called Crosswords. Uses deepdream about halfway through. I thought it was pretty groundbreaking when it came out a decade ago. Seems much more tame now, by today’s standards, but still gives me the feels

Edit: link https://m.youtube.com/watch?v=2EXslhx989M&pp=ygUUcGFuZGEgYmV...

▲

xg15 a day ago | parent | prev | next [-]

> It enables us to reach the look and feel of a $100+ million dollar Pixar, Star Wars, or Marvel film without the budgets we could never have without insane luck or nepotism.

I can understand filmmakers wanting this, but less so audiences.

The problem is that then everything will look like Marvel or Pixar or Star Wars.

The other problem is that as audience, I now can never be sure if a detail was put somewhere intentionally by the filmmakers or if it was just a random AI addition.

	▲	topocite 4 hours ago \| parent \| next [-]
		I just don't agree everything will look like Marvel or Star Wars. Everything looks like that now because the loss of DVD sales has altered the economics of movies so drastically. These tools should ultimately empower a golden age of experimentation and unique film making. I know for myself, I will absolutely be into this some day but it is just far too early. For me, I can see the potential of AI generated video but that is something for the 2030s. Right now it is like the digital audio workstation in 1992. It needs another decade to mature.
	▲	echelon a day ago \| parent \| prev [-]
		> everything will look like Marvel or Pixar or Star Wars. I want to immediately push back against that notion. I cited Disney because that's what people are familiar with. In reality, we're going to have more diversity than ever before. Stuff like this: https://www.reddit.com/r/aivideos/comments/1q22s3s/go_slowly... https://www.linkedin.com/feed/update/urn:li:activity:7409063... https://www.youtube.com/watch?v=9hlx5Rslrzk These are bad examples, but they're just what I have in front of me. "I've seen things you people wouldn't believe" is a fitting quote. There's going to be incredible stylistic diversity. > I now can never be sure if a detail was put somewhere intentionally by the filmmakers or if it was just a random AI addition. If it's in the film, it was intentional. What you won't know is whether it was serendipity. And the fact is quite a bit of filmmaking today is already serendipity.

▲

manbart a day ago | parent | prev | next [-]

I really enjoyed "Take Out," I can't believe people booed it! I especially liked the fortune cookie scene

▲

echelon a day ago | parent [-]

Thank you so much!

It was gut-wrenching to hear after a weekend of sleepless effort. This was a community I've been a part of for decades, and I've never had my work received like that before.

But at the same time, this was right after the strikes and just as AI was looking like it might threaten jobs. I totally and completely understand these folks - most of whom are actively employed in the industry - feeling threatened and vulnerable.

In the past two years the mood has definitely lightened. More filmmakers are incorporating AI into their work and it's becoming accepted now. Of the films that used AI in last October's 48 Hour Film Project, none of them were booed, and a few of them won awards.

We animated all of the gore in ours with AI (warning - this is bloody and gory) :

https://drive.google.com/file/d/1m6eUR5V55QXA9p6SLuCM8LdFMBq...

(This link will get rate limited - sorry. An untold volume of indie films are on the "deep web" as Google Drive links.)

We really had to rush on this one. We only had 24 hours due to a scheduling snafu and had to roll with the punches. The blood didn't do exactly what we wanted, and we had to keep the script super tight. Limited location, serendipitous real party we crashed with a filmmaking request, ...

These are like hackathons and game jams.

	▲	xrd 21 hours ago \| parent [-]
		I have to say, I'm enthralled by your two comments and what you've shared. My favorite director is Richard Linklater, mainly because his storytelling is incredible, but at a close second is the way he has pushed the boundaries of narrative with technology like rotoscoping (and which you referenced). This is a fascinating thread. I'm definitely a fan of your work.

▲

RIMR 17 hours ago | parent | prev | next [-]

I always thought that these models were going to revolutionize video compression. You would have something like a ~10GB compression model, and you could compress a 4K movie down to 500MB or something playable by anyone else with the model.

Maybe that's still going to happen?

	▲	echelon 32 minutes ago \| parent [-]
		I think so. I wouldn't be surprised if one day we could compress to just the semantics of the film - characters, movement, lighting, props, textures, etc. and then "re-render" on the other side of the pipe. If not for film and video, I think this will be used for "world models" / interactive video models that are playable "video games" / Holodecks. This will probably come much later down the road as there are so many other problems to work on before "video models are also compression".

▲

CyberDildonics a day ago | parent | prev | next [-]

you should check out Corridor Crew. They're well respected in our field,

In what field? They're youtubers, none of them have held a single job in visual effects. They make react videos. They aren't in effects, their industry is youtube.

▲

seanw444 21 hours ago | parent | next [-]

And they're pretty good at what they do.

▲

CyberDildonics 21 hours ago | parent [-]

What they do is react videos to things they don't understand, so it would be crazy to look to them as vfx leaders when they haven't even had an entry level job.

That would be like betting on someone who watches boxing every weekend to win a professional fight.

	▲	seanw444 21 hours ago \| parent [-]
		I don't know what their current video catalog looks like, but they used to do a lot of fairly impressive VFX videos themselves about a decade ago.

▲

imiric 19 hours ago | parent | prev [-]

That's flat out wrong. They have produced several web series, and their videos feature a lot of visual effects. Just because their career focuses on producing web content doesn't mean they're any less talented than someone working on feature films.

I can't comment on whether they're "well respected" in the VFX industry, but you're being misleadingly hostile.

▲

CyberDildonics 19 hours ago | parent [-]

I can't comment on whether they're "well respected" in the VFX industry,

They aren't, because they aren't in the vfx industry.

you're being misleadingly hostile.

No, this is honesty. People who only know vfx through fake youtubers want to defend them, but it's the blind leading the blind for clicks and views.

Just because their career focuses on producing web content doesn't mean they're any less talented than someone working on feature films.

They built their channel criticizing people who work on feature films. Their work is good according to them and acolytes who buy into it, but people who think they represent vfx don't realize this and suddenly it isn't fair to point out the truth.

▲

imiric 18 hours ago | parent [-]

> No, this is honesty.

No, it's bullshit.

From Wikipedia[1]:

> Corridor Digital LLC is an American independent production studio based in Los Angeles, known for creating pop-culture-related viral online short-form videos since 2010, as well as producing and directing the Battlefield-inspired web series Rush and the YouTube Premium series Lifeline. It has also created television commercials for various companies, including Machine Zone and Google.

You clearly have some bone to pick with them, but they're accomplished VFX artists. Whether they're good or not is a separate matter, but they're not "fake youtubers" or misleading anyone, unlike yourself.

[1]: https://en.wikipedia.org/wiki/Corridor_Digital

	▲	CyberDildonics 16 hours ago \| parent [-]
		Making web videos doesn't mean that they are able to do the visual effects that they criticize. They don't make "vfx artists react" videos to low grade web series. They call themselves vfx artists when they have never done that. They make web videos, they criticize professional work and you are completely ignoring that.

▲

krapp 17 hours ago | parent | prev [-]

Why would Disney fear you, though?

They'll still own all of the money-making IP that your models are trained on. You won't actually be able to get away with "X in the style of Disney or Y in the style of Pixar." And they'll have their own in-house models, far more expensive and powerful (likely through regulation) than you'll be able to afford.

You aren't talking about competition, you're talking about emulation. Putting an inferior version of existing properties on the market like The Asylum does.

I watched all of the links you provided. The "live action rotoscoped film" didn't need AI. They could have rented a knight costume or did that in AE, it wouldn't have even been that expensive. It wasn't even a good example of what rotoscoping could do - and I can actually accept rotoscoping and motion capture as a legitimate use of LLMs.

The concept for the Predator sketch was lame and it's ripping off the style of an Joel Haver, an animator with actual comedy talent. That legitimately kind of makes me mad. That's exactly the sort of thing that makes people hate AI. You aren't even ripping off a corporation here, which I could at least respect in abstract.

The Superman parody was generic and derivative, but it looked competent. Didn't really look like Robot Chicken though. The way they did the mouths was weird. And the whole plot revolved around a problem that was never actually a problem in the franchise.

The Grinch "anime" didn't look like a good anime. It looked like the kind of thing people criticize when an anime studio cuts costs for a second season. Still frames and very little animation. Inconsistent and generic style.

The horror movie posted below? The "blood" looked awful a lot of the time. The cinematography, as such, didn't carry the story at all. It isn't shot or cut the way an actual movie would be. The actors weren't compelling, the script was tepid.

Understand, I'm really trying not to just shit on this because it's AI, I'm trying to approach it as art because that's what it purports to be. And I can concede that the technical capability of AI has advanced dramatically in this field. They did a Robot Chicken and they did a Joel Haver and I saw a 90s cartoon but... it's bad art. I see no sense of actual unique creative vision anywhere. No sense of an actual artist trying to express something. Nothing that tells me "this could only have been done by AI, and would have been impossible beforehand."

It's like AI people think all you need is the aesthetic, and that the aesthetic they're imitating is a suitable replacement for the actual talent that went into it, but it isn't.