|
| ▲ | kleene_op 8 hours ago | parent | next [-] |
| Spatial awareness was also a huge limitation for Claude playing Pokémon. It really seems to me that the first AI company to implement "spatial awareness" vector tokens and integrate them neatly with the conventional text, image, and sound tokens will reap huge rewards.
Some are already partnering with robot companies; it's only a matter of time before one of them gets there. |
| |
| ▲ | nszceta 7 hours ago | parent [-] | | This is also my experience with attempting to use Claude and GLM-4.7 with OpenSCAD. Horrible spatial reasoning abilities. |
|
|
| ▲ | hypercube33 7 hours ago | parent | prev | next [-] |
| I disagree. With Opus I'll screenshot an app, draw all over it like a child with MS Paint, and paste it into the chat. It seems to reasonably understand what I'm asking with my chicken scratch and dimensions. As for 3D, I don't have experience; it could be quite awful at that. |
| |
| ▲ | vunderba 4 minutes ago | parent [-] | | Yeah at least for 2D, Opus 4.5 seems decent. It can struggle with finer details, so sometimes I’ll grab a highlighter tool in Photoshop and mark the points of interest. |
|
|
| ▲ | miohtama 8 hours ago | parent | prev [-] |
| They would need a spatial-reasoning or layout-specific tool to translate to English and back. |
| |
| ▲ | falcor84 7 hours ago | parent [-] | | I wonder if they could integrate a secondary "world model" trained/fine-tuned on RollerCoaster Tycoon to just do the layout reasoning, and have the main agent offload tasks to it. |
|