Remix clone Hacker News

krackers 5 hours ago
>progressively understands the business

This is no different from onboarding a new member of the team, and I think OpenAI was working on that "frontier":

>We started by looking at how enterprises already scale people. They create onboarding processes. They teach institutional knowledge and internal language. They allow learning through experience and improve performance through feedback. They grant access to the right systems and set boundaries. AI coworkers need the same things.

And tribal knowledge will not be a moat once execs realize that all they need to do is prioritize documentation instead of "code velocity" as a metric (sure, any metric gets Goodharted, but LLMs are great at sifting through garbage to find the high-perplexity tokens).

>But context limitation is fundamental to the technology in its current form

This may not be the case: large enough context windows plus external scratchpads would mostly obviate the need for true in-context learning.

The main issue today is that "agent harnesses" suck. The fact that Claude Code is considered good is more an indication of how bad everything else is. Tool traces read like a drunken newb brute-forcing his way through tasks. LLMs can mostly "one-shot" individual functions, but orchestrating everything is the blocker. (Yes, there's progress in METR or whatever, but I don't trust any of that, else we'd actually see the results in real-world open source projects.)

LLMs don't really know how to interact with subagents. They're generally sort of myopic even with tool calls. They'll spend 20 minutes trying to fix build issues, going down a rabbit hole without stepping back to think. I think some sort of self-play might end up solving all of these things: they need to develop a "theory of mind" in the same way that humans do, to understand how to delegate and interact with the subagents they spawn. (Today a failure case is that agents often don't realize subagents don't share the same context.)
Some of this is certainly in the base model and pretraining, but it needs to be brought out in the same way RL was needed for tool use.
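
To make the shared-context failure concrete, here's a rough sketch of what "delegating properly" might look like. Everything in it (`ParentContext`, `build_brief`, `run_subagent`) is a made-up name for illustration, not any real harness's API, and the subagent is a stub rather than a model call. The only point is that the subagent sees the brief string and nothing else, so the parent has to pack in whatever it needs.

```python
from dataclasses import dataclass, field


@dataclass
class ParentContext:
    """What the orchestrating agent has accumulated so far (hypothetical structure)."""
    goal: str
    facts: dict[str, str] = field(default_factory=dict)  # findings from earlier tool calls


def build_brief(ctx: ParentContext, task: str, needed: list[str]) -> str:
    """Render a self-contained brief: the subagent sees this string and nothing else."""
    missing = [key for key in needed if key not in ctx.facts]
    if missing:
        # Fail loudly rather than delegating a task the subagent cannot ground.
        raise KeyError(f"parent context lacks facts needed for delegation: {missing}")
    lines = [f"Overall goal: {ctx.goal}", f"Your task: {task}", "Known facts:"]
    lines += [f"- {key}: {ctx.facts[key]}" for key in needed]
    return "\n".join(lines)


def run_subagent(brief: str) -> str:
    """Stand-in for a real model call; it only has access to `brief`."""
    return f"received {len(brief.splitlines())} lines of context"


# Illustrative values only.
ctx = ParentContext(
    goal="get the test suite passing",
    facts={"build_tool": "make", "failing_target": "test-unit"},
)
brief = build_brief(ctx, "find why the failing target breaks", ["build_tool", "failing_target"])
result = run_subagent(brief)
```

The design choice here is that delegation fails up front if the parent can't name the facts the task depends on, which is roughly the "theory of mind" step: the parent has to model what the subagent does not know.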