gordonhart 3 hours ago
My broad vibe is that Gemini 3.1 Pro is the best at visual/spatial tasks and one-shotting, while Opus 4.6 is the best at path planning. This task leans heavily on both, but maybe a little more towards planning, so I'm not too shocked that Opus is narrowly on top. When running, the grids are represented in JSON, so the visual component is nullified, but it still requires pretty heavy spatial understanding to parse a big old JSON array of cell values. Given Gemini's image understanding, I do wonder if it would perform better with a harness that renders the grid visually.
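For what it's worth, here is a rough sketch of what that rendering step might look like. It assumes the grid is a rectangular 2D JSON array of small integer cell values, which is a guess since the actual format isn't spelled out here. The palette and the `grid_to_ppm` name are made up for illustration. It writes a plain-text PPM image using only the standard library.

```python
import json

# Hypothetical palette: cell value -> RGB. The real cell encoding is an assumption.
PALETTE = {0: (255, 255, 255), 1: (0, 0, 0), 2: (0, 160, 0), 3: (200, 0, 0)}
UNKNOWN = (128, 128, 128)  # fallback colour for values not in the palette


def grid_to_ppm(grid_json: str, scale: int = 8) -> str:
    """Render a JSON 2D array as a plain-text PPM (P3) image string."""
    grid = json.loads(grid_json)
    height, width = len(grid), len(grid[0])
    lines = ["P3", f"{width * scale} {height * scale}", "255"]
    for row in grid:
        if len(row) != width:
            raise ValueError("grid rows must all have the same length")
        # Repeat each cell `scale` times horizontally...
        pixel_row = " ".join(
            " ".join(str(channel) for channel in PALETTE.get(cell, UNKNOWN))
            for cell in row
            for _ in range(scale)
        )
        # ...and each row `scale` times vertically.
        lines.extend([pixel_row] * scale)
    return "\n".join(lines) + "\n"


ppm = grid_to_ppm("[[0, 1], [2, 3]]", scale=2)
```

The resulting string can be written to a `.ppm` file and converted to PNG with any image tool before being attached to the prompt. Whether that actually helps Gemini is an open question.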