observationist 2 days ago
It's not way too early, imo. This is the academic nerds' proof of concept for a school research project, not the "group of elite hackers get together and work out a world-class production-ready system". Agent platforms have similar modes of failure, whether it's creative writing, coding, web design, hacking, or any other sort of project scaffolding. A lot of recent research has dealt with resolving the underlying gaps in architectures and training processes, and it's had great success. I fully expect frontier labs to have generalized methodologizing capabilities within the first half of the year, and by the end of the year, the Pro/Max/Heavy variants of the chatbots will have the capabilities baked in fully. Instead of having Codex or Artemis or Claude Code, you can just ask the model to think and plan your project, whatever the domain, and get professional-class results, as if an expert human were orchestrating the project. All sorts of complex visual tool use, like PCB design and building plans and 3D modeling, have similar process abstractions, and the decomposition and specialized task execution are very similar in principle to the generalized skills I mentioned. I think '26 is going to be exciting as hell.