| ▲ | WhitneyLand 4 hours ago | |
I’d like to see embedding of actual video clips become practical in this type of workflow. Frame-level embedding covers a lot, but it can miss many action-related searches. | ||
| ▲ | iliashad 26 minutes ago | parent [-] | |
Sure, I'm using Qwen2.5-VL (https://huggingface.co/collections/Qwen/qwen25-vl), which helps with understanding actions like falling down: I can give it, for example, 5 frames (downscaled to 720p) so it can work out what is happening in that part of the video. | ||
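To make the "5 frames, downscaled to 720p" idea concrete, here is a minimal sketch of the frame-selection step only. It is not the commenter's actual code: the function names `sample_frame_indices` and `downscale_size` are hypothetical, "720p" is assumed to mean a target height of 720 pixels with the aspect ratio preserved, and evenly spaced sampling is an assumption. Decoding the frames and calling the model are left out.

```python
def sample_frame_indices(start_frame: int, end_frame: int, n: int = 5) -> list[int]:
    """Pick n evenly spaced frame indices in [start_frame, end_frame] (inclusive)."""
    if n < 1 or end_frame < start_frame:
        raise ValueError("need n >= 1 and end_frame >= start_frame")
    if n == 1:
        # A single frame: take the middle of the segment.
        return [(start_frame + end_frame) // 2]
    span = end_frame - start_frame
    return [start_frame + round(i * span / (n - 1)) for i in range(n)]


def downscale_size(width: int, height: int, target_height: int = 720) -> tuple[int, int]:
    """Return (width, height) scaled down to target_height, keeping aspect ratio.

    Assumption: frames that are already small enough are left unchanged
    (no upscaling), and the width is forced to an even number.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if height <= target_height:
        return (width, height)
    new_width = round(width * target_height / height)
    new_width -= new_width % 2  # keep width even
    return (new_width, target_height)


if __name__ == "__main__":
    # A 120-frame segment (about 4 s at 30 fps) from a 1080p source.
    print(sample_frame_indices(0, 120))   # [0, 30, 60, 90, 120]
    print(downscale_size(1920, 1080))     # (1280, 720)
```

The selected frames, resized to the computed dimensions, would then be passed together to the vision-language model as one multi-image prompt, so it sees the motion across the segment rather than a single still.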