amangsingh 4 hours ago
A 500k line codebase for an agent CLI proves one thing: making a probabilistic LLM behave deterministically is a massive state-management nightmare. Right now, they're great for prompting simple sites/platforms but they break at large enterprise repos. If you don't have a rigid, external state machine governing the workflow, you have to brute-force reliability. That codebase bloat is likely 90% defensive programming: frustration regexes, context sanitizers, tool-retry loops, and state rollbacks just to stop the agent from drifting or silently breaking things. The visual map is great, but from an architectural perspective, we're still herding cats with massive code volume instead of actually governing the agents at the system level.
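To make "a rigid, external state machine governing the workflow" concrete, here is a minimal sketch. Everything in it is hypothetical: the `State` names, the `ALLOWED` transition table, and the `Governor` class are illustrative assumptions, not taken from any real agent harness. The idea is only that the model proposes the next step while ordinary code owns the state and rejects illegal moves.

```python
from enum import Enum, auto


class State(Enum):
    PLAN = auto()
    EDIT = auto()
    TEST = auto()
    DONE = auto()
    FAILED = auto()


# Hypothetical transition table: the only moves the workflow permits.
ALLOWED = {
    State.PLAN: {State.EDIT},
    State.EDIT: {State.TEST},
    State.TEST: {State.EDIT, State.DONE},
}


class Governor:
    """Owns the workflow state; the model only proposes the next state."""

    def __init__(self, max_rejections=3):
        self.state = State.PLAN
        self.max_rejections = max_rejections
        self.rejections = 0  # consecutive illegal proposals
        self.history = [State.PLAN]

    def propose(self, next_state):
        """Apply a proposed transition if it is legal, otherwise reject it."""
        if self.state in (State.DONE, State.FAILED):
            return False  # terminal states accept nothing
        if next_state in ALLOWED.get(self.state, set()):
            self.state = next_state
            self.history.append(next_state)
            self.rejections = 0
            return True
        # Illegal move: count it, and give up after too many in a row.
        self.rejections += 1
        if self.rejections >= self.max_rejections:
            self.state = State.FAILED
            self.history.append(State.FAILED)
        return False


def run(governor, proposals):
    """Feed a sequence of model proposals through the governor."""
    for proposal in proposals:
        governor.propose(proposal)
        if governor.state in (State.DONE, State.FAILED):
            break
    return governor.state
```

In this sketch an agent that tries to jump from EDIT straight to DONE is simply refused and has to go through TEST first. That is the difference from catching the same mistake after the fact with retries and rollbacks.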
ttcbj 2 hours ago
I find it really strange that there is so much negative commentary on the _code_, but so little commentary on the core architecture. My takeaway from looking at the tool list is that they got the fundamental architecture right - try to create a very simple and general set of tools on the client-side (e.g. read file, output rich text, etc) so that the server can innovate rapidly without revving the client (and also so that if, say, the source code leaks, none of the secret sauce does). Overall, when I see this I think they are focused on the right issues, and I think their tool list looks pretty simple/elegant/general. I picture the server team constantly thinking - we have these client-side tools/APIs, how can we use them optimally? How can we get more out of them? That is where the secret sauce lives.
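A rough sketch of the "simple and general client-side tools" idea, under stated assumptions: the JSON request shape (`tool`, `args`), the `TOOLS` registry, and the two tool functions are all hypothetical and are not the actual client's protocol or tool list. The point is only that the client stays a thin dispatcher, so any cleverness about when and how to call the tools can live elsewhere.

```python
import json
from pathlib import Path


def read_file(path):
    """Return the text content of a file."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path, content):
    """Write text to a file and report how much was written."""
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters"


# Hypothetical registry: a small, general set of client-side tools.
TOOLS = {"read_file": read_file, "write_file": write_file}


def dispatch(request_json):
    """Run one tool request of the assumed form {"tool": ..., "args": {...}}."""
    request = json.loads(request_json)
    name = request.get("tool")
    handler = TOOLS.get(name)
    if handler is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": handler(**request.get("args", {}))}
    except (OSError, TypeError) as exc:
        # Missing files or wrong arguments become data, not a client crash.
        return {"ok": False, "error": str(exc)}
```

With a client like this, adding a new behaviour means changing which requests get sent, not shipping a new client.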
sunir 3 hours ago
It’s not surprising. There has been quite a bit of industrial research into how to manage mere apes to be deterministic with huge software control systems, and they are an unruly bunch I assure you.
tracyhenry 5 minutes ago
> they break at large enterprise repos.

I don't know where you get this. You should ask folks at Meta. They are probably the biggest and happiest users of CC.
chrismarlow9 an hour ago
We propped the entire economy up on it. Just look at the S&P top 10. Actually, even the top 50 holdings. If it doesn't deliver on the promise we have bigger problems than "oh no the code is insecure". We went from "I think this will work" to "this has to work because if it doesn't we have one of those 'you owe the bank a billion dollars' situations".
noosphr 2 minutes ago
What is going to be hilarious is rewriting that whole code base for each new version of Claude. Anyone who has been around since the GPT-3 days knows that the models have very different failure modes in each generation. After learning three I don't have the energy to do it again. The code base reads like it was written in blood, so for each new release you'd have months of unexpected 'oopsie, I deleted your whole company. I shouldn't have done that. I'm really sorry.' type events happening.
comboy 3 hours ago
It's hard to tell how much it says about the difficulty of harnessing vs how much it says about the difficulty of maintaining a clean and not bloated codebase when coding with AI.
nicoburns 2 hours ago
Kinda depends how much of it is vibe coded. It could easily be 5x larger than it needs to be just because the LLM felt like it if they've not been careful.
marcuscog 36 minutes ago
I think these folks are attempting to build systems with IAM, entity states, business rules: all built over two foundational DSLs - https://typmo.com
bwfan123 17 minutes ago
Brute-forcing pattern-matching at scale. These are brittle systems with enormous duct-taping to hold everything together. Workarounds on workarounds.
whycombagator 2 hours ago
> Right now, they're great for prompting simple sites/platforms but they break at large enterprise repos

Can you expand on this? My experience is they require excessive steering but do not “break”.
bogdanoff_2 3 hours ago
What do you mean by "actually governing the agents at the system level", and how is it different from "herding cats"?
p-e-w 3 hours ago
> A 500k line codebase for an agent CLI proves one thing: making a probabilistic LLM behave deterministically is a massive state-management nightmare.

Considering what the entire system ends up being capable of, 500k lines is about 0.001% of what I would have expected something like that to require 10 years ago. You can combine that with all the training and inference code, and at the end of the day, a system that literally writes code ends up being smaller than the LibreOffice codebase. It boggles the mind, really.
ramesh31 2 hours ago
> A 500k line codebase for an agent CLI proves one thing: making a probabilistic LLM behave deterministically is a massive state-management nightmare. Right now, they're great for prompting simple sites/platforms but they break at large enterprise repos.

Is that the case? I'm pretty sure Claude Code is one of the most massively successful pieces of software made in the last decade. I don't know how that proves your point. Will this codebase become unmanageable eventually? Maybe, but literally every agent harness out there is just copying their lead at this point.