| ▲ | falcor84 8 hours ago | |
But that's the thing: Claude Plays Pokemon is an experiment in having Claude work fully independently, so there's no "you" who would improve its onboarding docs or anything else; it has to do so on its own. And as long as it cannot do so reliably, it effectively has anterograde amnesia. And just to be clear, I'm mentioning this because I think that Claude Plays Pokemon is a playground for any agentic AI doing any sort of long-term independent work; I believe that the solution needed here is going to bring us closer to a fully independent agent in coding and other domains. It reminds me of the codeclash.ai benchmark, where similar issues show up across multiple "rounds" of an AI working on the same codebase. | ||
| ▲ | skybrian 6 hours ago | parent | |
Sure, it's not close to fully independent. But I was interpreting "much, much less employable" as not very useful for programming in its current state, and I think it is quite useful. | ||