lottaFLOPS 5 days ago

related research that was also announced this week: https://www.textquests.ai/

kqr 5 days ago | parent | next [-]

They seem to be going for a much simpler route of just giving the LLM a full transcript of the game with its own reasoning interspersed. I didn't have much luck with that, and I'm worried it might not be effective once we're into the hundreds of turns because of inadvertent context poisoning. It seems like this might indeed be what happens, given the slowing of progress indicated in the paper.

1970-01-01 5 days ago | parent | prev | next [-]

Very interesting how they all clearly suck at it. Even with hints, they can't understand the task enough to complete the game.

abraxas 5 days ago | parent | prev [-]

that's a great tracker. How often is the leaderboard updated?