itissid 3 days ago

My experience with Claude Code on anything bigger than a webpage, a small API, a tutorial on CSS, etc. has been pretty bad. I think context length is a manageable problem, but not the main one. I used it to write a 50K LoC Python code base with 300 unit tests; it went OK for the first few weeks and then it failed. This was with a CLAUDE.md file for every single module that needs one, as well as detailed agents for testing, design, coding and review.

I won't go into a case-by-case list of its failures. The core of the issue is misaligned incentives, which I want to get into:

1. The incentive for coding agents in general, and Claude in particular, is to write LOTS of code. None of them (O) are good at planning and verification.

2. The human is involved, ironically, in a haphazard way in the agent's process. This has to do with how the problem of coding is defined for these agents. Human developers are like snowflakes when it comes to opinions on software design, and there is no way to apply each one's preferences to the design of the system in any meaningful way (except with a papier-mâché and superglue mix of SO, Reddit threads and books). That makes a simple system way too complex, or a complex problem simplistic.

  - There is no way to evolve the plan to accept new preferences, except as text in a CLAUDE.md file in git that you have to read through and edit.

  - There is no way to know the near-term effect of code choices made now on the code a week from now.

  - So much code is written that asking a person to review it, when you are at the edge of the envelope and pushing the limit, feels morally wrong and an insane ask. How many of your code reviews have been replaced by 15-30 min design meetings that solicit feedback on the design of the PR, because it is so complex, and then just push the PR into dev? WTF am I even doing, I wonder.

  - It does not know how far to explore for better rewards and cannot tell them apart from local rewards, resulting in commented-out tests and arbitrarily deleted code to make its plan "work".

In short, code is a commodity for CEOs of coding agent companies and CXOs of your company to use (sales force has everyone coding, but that just raises the floor, and it's a good thing; it does NOT lower the bar and make people 10x devs). All of them have bought into this idea that 10x is somehow producing 10x code. Your time reviewing, unmangling and maintaining the code is not the commodity. It never ever was.