saejox 14 hours ago

What Carmack is doing is right. More people need to get away from training their models on words alone. AI needs physicality.
johnb231 3 hours ago | parent | next
> More people need to get away from training their models just with words.

They started doing that a couple of years ago. The frontier "language" models are natively multimodal, trained on audio, text, video, and images. That is all in one model, not separate models stitched together. The inputs are tokenized and mapped into a shared embedding space. Gemini, GPT-4o, Grok 3, Claude 3, Llama 4: all of these are multimodal, not just "language models".
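The "shared embedding space" idea can be shown with a toy sketch: each modality gets its own tokenizer/projection, but everything lands in vectors of the same dimension, so one transformer can consume a single mixed sequence. This is a minimal illustration with made-up sizes (64-dim embeddings, 16x16 image patches, random weights), not the actual architecture of any of the models named above.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # shared embedding dimension (illustrative)

# Per-modality embedders: text tokens look up rows of an embedding table,
# image patches are flattened and linearly projected into the same space.
text_vocab = 1000
text_embed = rng.normal(size=(text_vocab, D))
patch_proj = rng.normal(size=(16 * 16 * 3, D)) * 0.01

def embed_text(token_ids):
    # (T,) token ids -> (T, D) vectors in the shared space
    return text_embed[token_ids]

def embed_image(image):
    # Split an HxWx3 image into 16x16 patches, project each to (D,)
    h, w, c = image.shape
    patches = (image.reshape(h // 16, 16, w // 16, 16, c)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(-1, 16 * 16 * c))
    return patches @ patch_proj

# One mixed sequence for one model: 3 text tokens + 4 image patches,
# all living in the same 64-dim embedding space.
tokens = np.concatenate([embed_text(np.array([1, 5, 9])),
                         embed_image(rng.normal(size=(32, 32, 3)))])
print(tokens.shape)  # (7, 64)
```

A real model would then run this mixed sequence through shared transformer layers; the point is just that after embedding, the model no longer cares which modality a token came from.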
NL807 9 hours ago | parent | prev | next
> AI needs physicality.

Which I found interesting, because I remember Carmack saying simulated environments are the way forward and that physical environments are too impractical for developing AI.
programd 8 hours ago | parent | prev
Nvidia seems to think the same thing. Here's Jim Fan talking about a "physical Turing test" and how embodied AI is the way forward: https://www.youtube.com/watch?v=_2NijXqBESI He also talks about needing large amounts of compute to run the virtual environments where you'll train embodied AI. Very much worth watching.