joshribakoff 8 hours ago
I've been doing game development, and it starts to hallucinate more rapidly when it doesn't understand things like the direction it is placing things or which way the camera is oriented. Gemini models are a little better at spatial reasoning, but we're still not there yet, because these models were designed to process text, not to do spatial reasoning. In my development, I also use the ASCII matrix technique.
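For readers unfamiliar with the idea, here is a minimal sketch of what an ASCII matrix of a scene might look like before it is pasted into a prompt. The function name `render_ascii`, the symbols, and the top-down grid layout are all assumptions made for illustration, not necessarily what the commenter uses.

```python
def render_ascii(width, height, objects):
    """Render a top-down scene as a text grid.

    objects maps (x, y) grid coordinates to a single-character symbol.
    Row 0 is the top of the grid; x grows to the right, y grows downward.
    This layout is a hypothetical convention, chosen only for the example.
    """
    # Start with an empty grid where "." marks an unoccupied cell.
    grid = [["." for _ in range(width)] for _ in range(height)]
    for (x, y), symbol in objects.items():
        # Silently skip anything that falls outside the grid.
        if 0 <= x < width and 0 <= y < height:
            grid[y][x] = symbol
    return "\n".join("".join(row) for row in grid)


# Hypothetical scene: T = tree, P = player, ^ = camera facing "up" the grid.
scene = {(1, 0): "T", (3, 2): "P", (0, 3): "^"}
print(render_ascii(4, 4, scene))
```

The grid gives the model an explicit, text-only picture of where things are and which way the camera faces, which is the kind of information it otherwise has to guess.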
kleene_op 8 hours ago
Spatial awareness was also a huge limitation for Claude playing Pokémon. It really seems to me that the first AI company to implement "spatial awareness" vector tokens and integrate them neatly with the conventional text, image and sound tokens will reap huge rewards. Some are already partnering with robotics companies; it's only a matter of time before one of them gets there.
hypercube33 7 hours ago
I disagree. With Opus, I'll screenshot an app, draw all over it like a child in MS Paint, and paste it into the chat. It seems to reasonably understand what I'm asking with my chicken scratch and dimensions. I don't have experience with 3D, though, so it could be quite awful at that.
miohtama 8 hours ago
They would need a spatial reasoning or layout-specific tool to translate to English and back.