mohsen1 9 hours ago

src/cli/print.ts

This is the single worst function in the codebase by every metric:

  - 3,167 lines long (the file itself is 5,594 lines)
  - 12 levels of nesting at its deepest
  - ~486 branch points of cyclomatic complexity
  - 12 parameters + an options object with 16 sub-properties
  - Defines 21 inner functions and closures
  - Handles: agent run loop, SIGINT, rate limits, AWS auth, MCP lifecycle, plugin install/refresh, worktree bridging, team-lead polling (a while(true) inside), control-message dispatch (dozens of types), model switching, turn-interruption recovery, and more

This should be at minimum 8–10 separate modules.
mohsen1 7 hours ago | parent | next [-]

here's another gem. src/ink/termio/osc.ts:192–210

  void execFileNoThrow('wl-copy', [], opts).then(r => {
    if (r.code === 0) { linuxCopy = 'wl-copy'; return }
    void execFileNoThrow('xclip', ...).then(r2 => {
      if (r2.code === 0) { linuxCopy = 'xclip'; return }
      void execFileNoThrow('xsel', ...).then(r3 => {
        linuxCopy = r3.code === 0 ? 'xsel' : null
      })
    })
  })

are we doing async or not?
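
To make the complaint concrete, here is a hedged sketch of the same fallback chain written with plain async/await. `execFileNoThrow` is stubbed so the sketch is self-contained; its real signature is only assumed from the snippet (resolving to an object with a `code` exit status):

```typescript
type ExecResult = { code: number };

// Stub standing in for the real execFileNoThrow helper, so this runs
// anywhere: pretend only xclip is installed on this machine.
async function execFileNoThrow(cmd: string, args: string[]): Promise<ExecResult> {
  return { code: cmd === 'xclip' ? 0 : 1 };
}

// Try each clipboard tool in order; the first one that exits 0 wins.
async function detectLinuxCopy(): Promise<string | null> {
  for (const cmd of ['wl-copy', 'xclip', 'xsel']) {
    const r = await execFileNoThrow(cmd, []);
    if (r.code === 0) return cmd;
  }
  return null;
}
```

Same behavior, one level of nesting, and the probe order lives in a single list instead of a hand-rolled promise pyramid.
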
visarga 3 hours ago | parent | next [-]

Claude Code says thank you for reporting, I bet they will scan this chat to see what bugs they need to fix asap.

almostdeadguy 3 hours ago | parent | prev | next [-]

A defining work of the "just vibes" era.

mrcwinn an hour ago | parent [-]

You fail to mention the prior decades of really bad software engineers writing awful code, on which these models trained.

almostdeadguy an hour ago | parent [-]

Yes, anthropic is not the only company in the world with some shitty code, and yet I feel no pangs of guilt over laughing about it.

sudo_man 7 hours ago | parent | prev [-]

LOOOOOOOOOOL

novaleaf 6 hours ago | parent | prev | next [-]

I'm sure this is no surprise to anyone who has used CC for a while. This is the source of so many bugs. I would say "open bugs" but Anthropic auto-closes bugs that don't have movement on them in like 60 days.

0xbadcafebee 3 hours ago | parent | prev | next [-]

> This should be at minimum 8–10 separate modules.

Can't really say that for sure. The way humans structure code isn't some ideal best possible state of computer code, it's the ideal organization of computer code for human coders.

Nesting and cyclomatic complexity are indicators ("code smells"). They aren't guaranteed to lead to worse outcomes. If you have a function with 12 levels of nesting, but in each nest the first line is 'return true', you actually have 1 branch. If 2 of your 486 branch points are hit 99.999% of the time, the code is pretty dang efficient. You can't tell for sure if a design is actually good or bad until you run it a lot.
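
The early-return point is easy to make concrete with a toy example (hypothetical, not from the Claude Code source): a static analyzer counts every branch below, but if the first guard covers nearly all inputs, the dominant runtime path is one comparison and a return.

```typescript
// Hypothetical illustration: metric-visible branch count overstates
// what the hot path actually does.
function classify(n: number): string {
  if (n < 10) return 'small';   // suppose this guard covers ~99.9% of inputs
  if (n < 100) {
    if (n % 2 === 0) return 'medium-even';
    return 'medium-odd';
  }
  return 'large';
}
```

A complexity counter sees four branch points; the typical call executes one.
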

One thing we know for sure is LLMs write code differently than we do. They'll catch incredibly hard bugs while making beginner mistakes. I think we need a whole new way of analyzing their code. Our human programming rules are qualitative because it's too hard to prove if an average program does what we want. I think we need a new way to judge LLM code.

The worst outcome I can imagine would be forcing them to code exactly like we do. It just reinforces our own biases, and puts in the same bugs that we do. Vibe coding is a new paradigm, done by a new kind of intelligence. As we learn how to use it effectively, we should let the process of what works develop naturally. Evolution rather than intelligent design.

zarzavat 2 hours ago | parent | next [-]

I don't buy this. Claude doesn't usually have any issues understanding my code. It has tons of issues understanding its code.

The difference between my code and Claude's code is that when my code is getting too complex to fit in my head, I stop and refactor it, since for me understanding the code is a prerequisite for writing code.

Claude, on the other hand, will simply keep generating code well past the point when it has lost comprehension. I have to stop, revert, and tell it to do it again with a new prompt.

If anything, Claude has a greater need for structure than me since the entire task has to fit in the relatively small context window.

crakhamster01 an hour ago | parent | prev | next [-]

> One thing we know for sure is LLMs write code differently than we do.

Kind of. One thing we do know for certain is that LLMs degrade in performance with context length. You will undoubtedly get worse results if the LLM has to reason through long functions and high LOC files. You might get to a working state eventually, but only after burning many more tokens than if given the right amount of context.

> The worst outcome I can imagine would be forcing them to code exactly like we do.

You're treating "code smells" like cyclomatic complexity as something that is stylistic preference, but these best practices are backed by research. They became popular because teams across the industry analyzed code responsible for bugs/SEVs, and all found high correlation between these metrics and shipping defects.

Yes, coding standards should evolve, but... that's not saying anything new. We've been iterating on them for decades now.

I think the worst outcome would be throwing out our collective wisdom because the AI labs tell us to. It might be good to question who stands to benefit when LLMs aren't leveraged efficiently.

FuckButtons 3 hours ago | parent | prev | next [-]

I’ve heard this take before, but if you’ve spent any time with LLMs I don’t understand how your take can be: “I should just let this thing that makes mistakes all the time, and seems oblivious to the complexity it’s creating because it only observes small snippets out of context, make its own decisions about architecture; this is just how it does things and I shouldn’t question it.”

meffmadd 3 hours ago | parent | prev [-]

I think this view assumes no human will or should ever read the code. It's considered bad practice because someone else won't understand the code as well, whether it was written by a human or an agent. Unless human oversight is no longer needed at all, agents should still code like us.

jollymonATX 4 hours ago | parent | prev | next [-]

Maybe going slow is a feature for them? A kind of rate-limiting-by-bad-code way of controlling overall throughput.

ykonstant 5 hours ago | parent | prev | next [-]

"That's Larry; he does most of the work around here."

dwa3592 5 hours ago | parent [-]

lmao

epolanski 2 hours ago | parent | prev | next [-]

Hmmm it's likely they have found that it works better for LLMs that need to operate on it.

DustinBrett 5 hours ago | parent | prev | next [-]

"You can get Claude to split that up"

keeganpoppen 4 hours ago | parent | prev | next [-]

the claude code team ethos, as far as i’ve been led to understand— which i agree with, mind you— is that there is no point in code-reviewing ai-generated code… simply update your spec(s) and regenerate. it is just a completely different way of interacting with the world. but it clearly works for them, so people throwing up their hands should at least take notice of the fact that they are absolutely not competing with traditional code along traditional lines. it may be sucky aesthetically, but they have proven from their velocity that it can be extremely effective. welcome to the New World Order, my friend.

knome 3 hours ago | parent | next [-]

>there is no point in code-reviewing ai-generated code

the idea that you should just blindly trust code you are responsible for without bothering to review it is ludicrous.

jen20 2 hours ago | parent | next [-]

(I mostly agree with you, but) devil's advocate: most people already do that with dependencies, so why not move the line even further up?

batshit_beaver 2 hours ago | parent | next [-]

Because you trust that your dependencies are not vibe coded and have been reviewed by humans.

bdangubic an hour ago | parent [-]

except they are vibe-or-not coded by some dude in Reno NV who wouldn’t pass a phone screen where you work

almostdeadguy an hour ago | parent | prev [-]

There's a reputational filtering that happens when using dependencies. Stars, downloads, last release, who the developer is, etc.

Yeah we get supply chain attacks (like the axios thing today) with dependencies, but on the whole I think this is much safer than YOLO git-push-force-origin-main-ing some vibe-coded trash that nobody has ever run before.

I also think this isn't really true for the FAANGs, who ostensibly vendor and heavily review many of their dependencies because of the potential impacts they face from them being wrong. For us small potatoes I think "reviewing the code in your repository" is a common sense quality check.

eclipxe 2 hours ago | parent | prev [-]

Why?

fl4regun 2 hours ago | parent [-]

Is this a serious question? If you are handling sensitive information, how do you confirm your application is secure and won't leak or expose it to people who shouldn't see it?

lijok 34 minutes ago | parent [-]

How do you with classic code?

lqstuart an hour ago | parent | prev | next [-]

yes, because who ever heard of an AI leaking passwords or API keys into source code

lanbin 3 hours ago | parent | prev | next [-]

I see. They got unlimited tokens, right?

Salgat 3 hours ago | parent | prev [-]

While the technology is young, bugs are to be expected, but I'm curious what happens when their competitors mature their products, clean up the bugs, and stabilize, while Claude is still stuck in this trap where a certain number of bugs and issues are a constant fixture due to vibe coding. But hey, maybe they really do achieve AGI and get past the limitations of vibe coding without human involvement.

mohsen1 6 hours ago | parent | prev | next [-]

it's the `runHeadlessStreaming` function btw

acedTrex 6 hours ago | parent | prev | next [-]

Well, literally no one has ever accused Anthropic of having even halfway competent engineers. They are akin to monkeys whacking stuff with a stick.

siruwastaken 7 hours ago | parent | prev | next [-]

How is it that an AI coding agent that is supposedly _so great at coding_ is running on this kind of slop behind the scenes? /s

WesolyKubeczek 3 hours ago | parent | next [-]

But it is running, that's the mystery.

rirze 6 hours ago | parent | prev [-]

Because it’s based on human slop. It’s simply the student.

phtrivier 8 hours ago | parent | prev [-]

Yes, if it was made for human comprehension or maintenance.

If it's entirely generated / consumed / edited by an LLM, arguably the most important metric is... test coverage, and that's it?

mdavid626 7 hours ago | parent | next [-]

Oh boy, you couldn't be more wrong. If anything, LLMs need MORE readable code, not less. Do you want to burn all your money on tokens?

jen20 2 hours ago | parent [-]

I very much doubt Anthropic devs are metered, somehow.

grey-area 8 hours ago | parent | prev | next [-]

LLMs are so so far away from being able to independently work on a large codebase, and why would they not benefit from modularity and clarity too?

olmo23 6 hours ago | parent [-]

I agree the functions in a file should probably be reasonably-sized.

It's also interesting to note that due to the way round-tripping tool-calls work, splitting code up into multiple files is counter-productive. You're better off with a single large file.

konart 8 hours ago | parent | prev | next [-]

Can't we have LLM-generated code be more human-maintainable?

mrbungie 7 hours ago | parent | prev | next [-]

Can't wait to have LLM-generated physical objects that explode in your face and no engineer can fix.

phtrivier 4 hours ago | parent [-]

Oh, do we agree on that. I never said it was "smart" - I just had a theory that would explain why such code could exist (see my longer answer below).

Bayko 7 hours ago | parent | prev [-]

Yeah, I honestly don't understand his comment. Is it bad code? Pre-2026? Sure. In 2026? Nope. Is it going to be a headache for some poor person on call? Yes. But then again, are you "supposed" to go through every single line in 2026? Again, no. I hate it. But the world is changing, and until the bubble pops this is the new norm.

phtrivier 5 hours ago | parent | next [-]

Sorry, I was not clear enough.

My first word was literally "Yes", so I agree that a function like this is a maintenance nightmare for a human. And, sure, the code might not be "optimized" for the LLM, or for token efficiency.

However, to try and make my point clearer: it's been reported that Anthropic has "some developers who don't write code" [1].

I have no inside knowledge, but it's possible, by extension, to assume that some parts of their own codebase are "maintained" mostly by LLMs themselves.

If you push this extension, then, the code that is generated only has to be "readable" to:

* the next LLM that'll have to touch it

* the compiler / interpreter that is going to compile / run it.

In a sense (and I know this is a stretch, and I don't want to overdo the analogy), are we here judging a program's quality by reading something more akin to "the x86 asm output by the compiler", rather than the "source code", which in this case is "English prompts" hidden somewhere in the Claude Code session of a developer?

Just speculating, obviously. My org is still much more cautious, mandating the same standards for code generated by LLMs as for code written by humans; and I agree with that.

I would _not_ want to debug the function described by the commenter.

So I'm still very much on the "Claude as a very fast text editor" side, but is it unreasonable to assume that Anthropic might be further along on the "Claude as a compiler for English" side?

[1] https://www.reddit.com/r/ArtificialInteligence/comments/1s7j...

heavyset_go 3 hours ago | parent [-]

If that's the case then that's dumb

yoz-y 6 hours ago | parent | prev [-]

The jury on this one is still out.