I first tried getting specific with Claude Code. I made the Claude.md, I detailed how to do TDD, what steps it should take, the commands it should run. It was imperfect. Then I had it plan (think hard) and write the plan to a file. I’d clear context, have it read the plan, ask me questions, and then have it decompose the plan into a detailed plan of discrete tasks. Have it work its way through that. It would inevitably go sideways halfway through, even clearing context between each task. It wouldn’t run tests, it would commit breakage, it would flip flop between two different broken approaches, it was just awful. Now I’ve just been vibing, writing as little as possible and seeing what happens. That sucks, too.

It’s amazing at reviewing code. It will identify what you fear, the horrors that lie within the codebase, and it’ll bring them out into the sunlight and give you a 7 step plan for fixing them. And the coding model is good, it can write a function. But it can’t follow a plan worth shit. And if I have to be extremely detailed at the function by function level, then I should be in the editor coding. Claude code is an amazing niche tool for code reviews and dialogue and debugging and coping with new technologies and tools, but it is not a productivity enhancement for daily coding.

▲

liszper 4 days ago | parent | next [-]

With all due respect, you sound like someone who is just getting familiar with these tools. 100 more hours spent with AI coding and you will be much more productive. Coding with AI is a slightly different skill from coding, similar how managing software engineers is different from writing software.

▲

abtinf 4 days ago | parent | next [-]

liszper:

> most SWE folks still have no idea how big the difference is between the coding agents they tried a year ago and declared as useless and chatgpt 5 paired with Codex or Cursor today

Also liszper: oh, you tried the current approach and don’t agree with me? Well you just don’t know what you are doing.

▲

bubblyworld 4 days ago | parent | next [-]

Lol, what is up with everyone assuming there's no learning curve to these things? If you applied this argument to literally any other tool you would be laughed at, for good reason.

▲

bluefirebrand 4 days ago | parent [-]

Probably because "there's no learning curve they are just magic tools" is how they are marketed and how our managers are expecting them to work

	▲	bubblyworld 4 days ago \| parent [-]
		Sure, but people are allowed to have their own opinions too.

▲

liszper 4 days ago | parent | prev | next [-]

Yes, exactly. Learning new things is hard. Personally it took me about 200 hours to get started, and since then ~2500 hours to get familiar with the advanced techniques, and now I'm very happy with the results, managing extremely large codebases with LLM in production.

For context before that I had ~15 years of experience coding the traditional way.

▲

chownie 4 days ago | parent | next [-]

Has anyone else noticed the extreme dichotomy of developers using AI agents? Either AI agents essentially don't work, or they are apparently running legions of agents to produce some nebulous gigantic estate.

I think the crucial difference is that I do actually see evidence (ie the codebase) posted sometimes for the former, the latter could well be entirely mythos -- a 24 day old account evangelizing for the legion of agents story does kind of fit the theme.

▲

azinman2 4 days ago | parent | prev | next [-]

You should write a blog post about your learnings. If you could even give some high level highlights here that’d be really helpful.

▲

4 days ago | parent | prev | next [-]

[deleted]

▲

sarchertech 4 days ago | parent | prev [-]

How many users is production and how large is extremely large.

▲

liszper 4 days ago | parent [-]

200k DAU, 7 million registered, ~50 microservices, large monorepo

▲

sarchertech 4 days ago | parent | next [-]

You have 50 microservices for 200k daily users?

Let me guess this has something to do with AI?

	▲	liszper 4 days ago \| parent [-]
		No, It has something to do with experience. The system is highly integrated to other platforms and have to stay afloat during burst loads.

▲

pjc50 4 days ago | parent | prev [-]

.. what is this thing and can we see it?

▲

liszper 4 days ago | parent [-]

you can OSINT me pretty easily, not going to post it here for the sake of anonymity against crawlers who train models on our conversations. today's HN comments are tomorrow's coding LLMs

	▲	4 days ago \| parent [-]
		[deleted]

▲

joquarky 19 hours ago | parent | prev | next [-]

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

https://news.ycombinator.com/newsguidelines.html

▲

pjc50 4 days ago | parent | prev [-]

Funnily enough the same kind of approach you get from Lisp advocates and the more annoying faction of Linux advocacy (which isn't as prevalent these days, it seems)

	▲	klibertp 4 days ago \| parent \| next [-]
		> the same kind of approach you get from Lisp In what way? Lisp (Common Lisp) is the most stable and unchanging language out there. If you learned it anytime after the late 80s, you still know it, and will know it until the end of time. Meanwhile, here, we hear that "a year ago" is so much time that everything changed (for the better, of course). Or is it about needing some serious time investment to get comfortable with Lisp? Even then, once you do spend enough time that s-exprs stop being a problem, that's it; there's nothing else to be getting comfortable with, and certainly, you won't need to relearn any of that a year into the future. I don't think AI coding and Lisp are comparable, even considering just the tone of messages on the topic (as far as I can see, "smug lisp weenies" are a thing of the ancient past).
	▲	liszper 4 days ago \| parent \| prev [-]
		I'm also a lisper, yes.

▲

ryandrake 4 days ago | parent | prev | next [-]

I'm starting to kind of dig C.C. but you're right, it definitely feels like a very confident, very ambitious high schooler level developer with infinite energy. You really have to give it very small tasks and be constantly steering its direction. At the end of the day, I'm not sure I'm really saving that much time coaching Claude to do the job right vs. just writing the code myself, but it's definitely a neat trick.

The difference from an actual junior developer, of course, is that the human junior developer learns from his mistakes and gets better, but Claude seems to be stuck at the level of expertise of its model, and you have to wait for the model to improve before Claude improves.

▲

jonstewart 4 days ago | parent [-]

The thing I am calling BS on is that there's much productivity gain in giving it very small tasks and constantly steering its direction. For 80% of code, I'm faster than it if that's what I have to do. For debugging? For telling it to integrate a new tool? Port my legacy build system to something better? It's great at that, removes some important barriers to entry.

	▲	rmunn 4 days ago \| parent [-]
		Bingo. All my experience is on Linux, and I've never written anything for Windows. So recently when I needed to port a small C program to Windows, I told ChatGPT "Hey, port this to Windows for me". I wouldn't trust the result, I'd rewrite it myself, but it let me find out which Win32 API functions I'd be calling, and why I'd be calling them, faster than searching MSDN would have done.

▲

jonstewart 4 days ago | parent | prev | next [-]

I think it has more to do with the kind of software I write and their requirements then it has to do with spending more time with this current tool. For some things it's great, but it's been a net productivity loss for me on my core coding responsibilities.

▲

zmmmmm 4 days ago | parent | prev | next [-]

ah, they are holding it wrong.

I am always so skeptical of this style of response. Because if it takes hundreds of hours to learn to use something, how can it really be the silver bullet everyone was claiming earlier? Surely they were all in the midst of the 100 hours. And what else could we do if we spent 100 hours learning something? It's a lot of time, a huge investment, all on faith that things will get better.

	▲	jmatthews 2 days ago \| parent [-]
		How many hours do you have mastering git or your IDE or your library of choice for UX?

▲

TheRoque 4 days ago | parent | prev [-]

Then, it's the job of someone else to use these tools, not developers

▲

liszper 4 days ago | parent [-]

I agree with your point. I think this is the reason why most developers still don't get it, because AI coding ultimately requires a "higher level" methodology.

▲

dgfitz 4 days ago | parent [-]

"Hacker culture never took root in the 'AI' gold rush because the LLM 'coders' saw themselves not as hackers and explorers, but as temporarily understaffed middle-managers." [0]

This, this is you. This is the entire charade. It seems poetic somehow.

[0]https://news.ycombinator.com/item?id=45123094

▲

liszper 4 days ago | parent [-]

I see myself as a hacker.

▲

dgfitz 4 days ago | parent [-]

By your own exposition, you aren’t a hacker.

	▲	liszper 4 days ago \| parent [-]
		hackers can also cook and not become a chef

▲

cadamsdotcom 4 days ago | parent | prev [-]

Don’t give up on TDD.

I’ve invested hundreds of hours in process and tooling, and can now ship major features with tests in record time with Claude Code.

You have to coach it in TDD - no matter how much you explain in CLAUDE.md. That’s part because “a test that fails because the code isn’t written yet” is conceptually very similar to “a test that passes without the code we’re about to write” and is also similar to “a test that asserts the code we’re about to write is not there”. You have to watch closely to make sure it produces the first thing.

Why does it keep getting confused? You can’t blame it really. When two things are conceptually similar, models need lots of examples to distinguish between them. If the set of samples is sparse the model is likely to jump the small distance from a concept to similar ones.

So, you have to accept this as how Claude 4 works, keep it on a short leash, keep reminding it that it must watch the test fail, ask it if the test failed for the right reason (not some setup issue), and THEN give it permission to write the code.

The result is two mirror copies of your feature or fix: code and tests.

Reviewing code and tests together is pleasant because they mirror one another. The tests forever ensure your feature works as described, no manual testing needed, no regressions. And the model knows all the tricks to make your tests really beautiful.

TDD is the check and balance missing from most people’s agentic software dev process.

	▲	jonstewart 3 days ago \| parent [-]
		Oh, I will never give up on TDD. And the assistants are great at helping to write tests, and especially analyzing the tests you have and suggesting others for edge cases. But I have repeatedly seen claude get hung up on TDD itself and I've tried lots of different prompts/directions. It runs into a problem and inevitably runs ever more complicated shell commands and creating weird temp input files than sticking to "cargo test" and addressing the failing test. Since I need to review the agent's code, I'd much prefer it to use a workflow like a human, with a progression of small commits following TDD--much easier to review the code then. If it's just splatting up big diffs, then it makes review harder, and that offsets any productivity gains.