thomastjeffery 8 months ago
There is no such thing as "thing" here. These models are trained such that the given conditions (the visual input and the text prompt) will be continued with a desirable continuation (motor function over time). The only dimension accuracy can apply to is desirability.
jayd16 8 months ago
You don't think there's any segmentation going on?