lambdaloop 2 hours ago
This is fascinating! Pairing a really strong video encoder model with a simpler decoder on top of it reminds me of the recent D4RT from DeepMind as well: https://d4rt-paper.github.io/ I think we'll see more of these video encoder models in the coming years. They truly seem like magic.