daxfohl | a day ago
This was from a few months ago, so things may be different now. I only used OpenAI models, and o3 did by far the best. gpt-4o was equivalent on the basic scenario where I had it make one move at a time (and it was still pretty good, all things considered), but once I started having it summarize state, o3 used the summaries to improve its performance, whereas 4o actually got worse.

But yeah, that's one of the things I tried: "Your turn is over. Please summarize everything you have learned about the maze so someone else can pick up where you left off." It did okay, but it often included superfluous information, it sometimes forgot to include the current orientation (the available actions were "move forward", "turn right", and "turn left", so knowing the current orientation was important), and it always forgot to include instructions on how to interpret the state: in particular, which absolute direction corresponded to an increase or decrease of which grid index.

I even tried to coax it into defining a formal state representation and "instructions for an LLM to use it" up front, to see if it would remember to include the direction/index correspondence, but it never did (see the sketch below for the kind of thing I was after). It was amusing, actually; it was apparent it was just doing whatever I told it and not thinking for itself. Something like: "Do you think you should include a map in the state representation? Would that be useful?" "Yes, great idea! Here is a field for a map, and an algorithm to build it." "Do you think a map would be too much information?" "Yes, great consideration! I have removed the map field." "No, I'm asking you. You're the one that's going to use this. Do you want a map or not?" "It's up to you! I can implement it however you like!"
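To make the gap concrete, here's a minimal sketch (in Python, names hypothetical; this is my reconstruction of what I was hoping the model would define, not anything it actually produced) of a handoff state that pins down the two things it kept dropping: the current facing, and the mapping from absolute directions to grid-index deltas.

    from dataclasses import dataclass, field

    # The "how to interpret the state" piece the model always omitted:
    # which absolute direction increments or decrements which grid index.
    DIRECTION_TO_DELTA = {
        "north": (-1, 0),  # north = row index decreases
        "south": (1, 0),   # south = row index increases
        "east": (0, 1),    # east = column index increases
        "west": (0, -1),   # west = column index decreases
    }

    # Turns are relative, so the state must carry the current facing.
    LEFT_OF = {"north": "west", "west": "south", "south": "east", "east": "north"}
    RIGHT_OF = {v: k for k, v in LEFT_OF.items()}

    @dataclass
    class MazeState:
        position: tuple[int, int]  # (row, col)
        facing: str                # a key of DIRECTION_TO_DELTA
        visited: set[tuple[int, int]] = field(default_factory=set)

        def apply(self, action: str) -> None:
            """Update the state for one of the three allowed actions."""
            if action == "turn left":
                self.facing = LEFT_OF[self.facing]
            elif action == "turn right":
                self.facing = RIGHT_OF[self.facing]
            elif action == "move forward":
                dr, dc = DIRECTION_TO_DELTA[self.facing]
                self.position = (self.position[0] + dr, self.position[1] + dc)
                self.visited.add(self.position)
            else:
                raise ValueError(f"unknown action: {action}")

The point is just that a usable summary has to state both conventions explicitly; "I'm at (3, 2)" means nothing to the next reader without them.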