ryeguy_24 2 days ago
Agree. Also, with respect to training, what objective are we optimizing? LLMs are easy: predict the next word, and we have lots of training data. But what are we training for in the real world? Modeling the next spatial photograph to predict what will happen next? It’s not intuitive to me what that objective function would be in spatial intelligence.
kadushka 2 days ago
Why wouldn’t predicting the next frame in a video stream be as effective as predicting the next word?
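For concreteness, here is a rough sketch of how the two objectives compare when written out. The function names are hypothetical, and pixel-wise mean squared error is only an assumed stand-in for a "next frame" loss; it is not claimed to be what any particular video model uses.

```python
import math


def next_token_loss(probs, target_index):
    """Cross-entropy for one step: negative log-probability
    the model assigned to the token that actually came next."""
    return -math.log(probs[target_index])


def next_frame_loss(predicted_frame, actual_frame):
    """Assumed stand-in for a next-frame objective: mean squared
    error over pixel values (frames given as flat lists of floats)."""
    if len(predicted_frame) != len(actual_frame):
        raise ValueError("frames must have the same number of pixels")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted_frame, actual_frame)]
    return sum(squared_errors) / len(actual_frame)


# Toy usage: a 3-word vocabulary and a 2-pixel "frame".
token_loss = next_token_loss([0.25, 0.5, 0.25], target_index=1)
frame_loss = next_frame_loss([0.0, 1.0], [0.0, 0.0])
```

Both reduce to "score the prediction against what actually happened next"; the open question in the thread is whether a loss of the second kind captures what we mean by spatial intelligence.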
curiouscavalier 2 days ago
Or that there is a sufficiently generalizable objective function for all “spatial intelligence.” |