djtango 3 days ago
I don't quite get how diffing frames allows you to find the scores. TFA mentions comparing a frame with and without - but how do you generate that frame without? If you can already do it, what's useful about doing that?
sebastiennight 2 days ago
He's diffing two frames, and the only pixels that stay the same are the UI. The diff itself doesn't give him a readable UI (see the example, it's illegible), but he can extract the POSITION of the UI on the screen by finding all the non-red pixels. Then he does a good ol' regular crop on the original image to get the UI excerpt to feed the vision model.
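A minimal sketch of that idea, not the article's actual code. It assumes the frames are already loaded as same-sized NumPy arrays; `static_region_bbox`, `crop`, and the `tolerance` parameter are hypothetical names made up for illustration. The "unchanged" mask plays the role of the non-red pixels in the diff image.

```python
import numpy as np

def static_region_bbox(frame_a, frame_b, tolerance=0):
    """Bounding box (top, left, bottom, right) of pixels identical in both frames.

    Hypothetical helper: returns None if every pixel changed.
    """
    # Cast to a signed type so uint8 subtraction cannot wrap around.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    if diff.ndim == 3:
        # Collapse colour channels: a pixel changed if any channel changed.
        diff = diff.max(axis=2)
    unchanged = diff <= tolerance
    rows = np.flatnonzero(unchanged.any(axis=1))
    cols = np.flatnonzero(unchanged.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1

def crop(frame, bbox):
    """Plain crop of the ORIGINAL frame, which is what goes to the vision model."""
    top, left, bottom, right = bbox
    return frame[top:bottom, left:right]

# Synthetic example: the "gameplay" differs everywhere, the "scoreboard" does not.
frame_a = np.zeros((10, 12, 3), dtype=np.uint8)
frame_b = np.full((10, 12, 3), 100, dtype=np.uint8)
frame_a[2:5, 3:9] = 255
frame_b[2:5, 3:9] = 255

bbox = static_region_bbox(frame_a, frame_b)
scoreboard = crop(frame_a, bbox)
```

In real footage some background pixels will match by coincidence, so a single bounding box over all unchanged pixels would be too generous. Comparing frames that are far apart in time, or keeping only the largest connected unchanged region, would presumably be needed.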
barbegal 2 days ago
I think the text is wrong: it's diffing two frames, and the areas that are the same are where the scoreboard is, since the scoreboard doesn't change between frames but everything else does.