jdmoreira 4 days ago
Fascinating, but aren't some of these models multimodal? So for sure they saw the Earth.
jdmoreira 4 days ago
The author addresses it in the text:

> These are our first sizable multimodal models. You might object that this defeats the title of the post ("it's not blind!"), but I suspect current multimodality is so crude that any substantial improvement to the model's unified internal map of the world would be a miracle. Remember, we're asking it about individual coordinates, one at a time.