Zee2 14 hours ago
Seems like the search is based only on the transcript/dialogue, not on an image embedding. Would be super cool to actually run a CLIP-style embedding search over the frames for a more effective fuzzy lookup.
petercooper 4 hours ago
Agreed. If you search for Barney, say, none of the top ten results picture him at all; they are mostly people speaking to or about him. Even running the frames through a vision LLM for a list of keywords would yield better results than the subtitles, I suspect.
adzm 12 hours ago
How would someone go about doing this? Just curious.
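For what it's worth, the retrieval half of an embedding search can be sketched in a few lines. This is only a rough illustration, assuming every frame and the text query have already been turned into vectors by some CLIP-style model. The frame IDs, the three-dimensional vectors, and the function names `cosine_similarity` and `search_frames` are all made up for the example; real embeddings have hundreds of dimensions and would come from the model, not be typed in by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_frames(query_embedding, frame_embeddings, top_k=3):
    """Rank frames by similarity to the query and return the top_k (frame_id, score) pairs."""
    scored = [
        (frame_id, cosine_similarity(query_embedding, embedding))
        for frame_id, embedding in frame_embeddings.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical placeholder vectors standing in for image embeddings of three frames.
frame_embeddings = {
    "frame_001": [0.9, 0.1, 0.0],
    "frame_002": [0.1, 0.9, 0.1],
    "frame_003": [0.7, 0.3, 0.1],
}

# Hypothetical placeholder standing in for the text embedding of a query like "Barney".
query_embedding = [1.0, 0.0, 0.0]

results = search_frames(query_embedding, frame_embeddings, top_k=2)
for frame_id, score in results:
    print(f"{frame_id}: {score:.3f}")
```

The point is that the ranking depends on what the frame looks like rather than on what the subtitles say, so a frame that pictures a character can score highly even if nobody names him in the dialogue. For a large collection the linear scan would normally be replaced by a vector index, but the scoring idea is the same.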