sielakis 3 days ago
The thing is, it still feels like a mixed bag to me. It's good enough for things I can define well and could write decent code for myself, but it is far from perfect. It does too much, like any LLM. For example, I had some test cases for deleted methods, and being lazy, I didn't want to read a huge test file, so I asked it to fix them. It did: the tests were green because it mocked the non-existent methods, when it should have just deleted the test cases, since they were no longer needed. Luckily, I read the code it produced. The same thing happened with some decorators I asked it to write in Python. It produced working code and the tests were fine, but I reworked the code manually down to 1/10 of the size Opus proposed. It seems magical, as if it were thinking, but like all LLMs, it is not. It is just a trap.
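The mocked-non-existent-methods failure mode is easy to reproduce: Python's `unittest.mock.MagicMock` fabricates any attribute you touch, so a test for a deleted method stays green. A minimal sketch (the `Billing` class and `legacy_discount` method are hypothetical, not from the comment above):

```python
from unittest.mock import MagicMock

class Billing:
    """Current class: legacy_discount() was deleted in a refactor."""
    def total(self, amount):
        return amount

# An unspecced mock fabricates any attribute, so a stale test
# calling the deleted method still "passes":
billing = MagicMock()
result = billing.legacy_discount(100)  # method no longer exists, call succeeds anyway

# Guard: spec-ing the mock against the real class makes the
# stale test fail loudly instead of silently passing.
strict = MagicMock(spec=Billing)
try:
    strict.legacy_discount(100)
except AttributeError:
    print("AttributeError: stale test caught")
```

Using `spec=` (or `create_autospec`) is the usual defence: the mock then only exposes attributes the real class actually has.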
j16sdiz 3 days ago
Small tip: when an LLM does the wrong thing, don't correct it with a new instruction. Instead, edit your last prompt and add more detail there. LLMs have limited context windows, and they love to get stuck on their previous errors. Don't let the failed attempt pollute your context.
risyachka 2 days ago
>> but I reworked the code manually to 1/10 of the size proposed by Opus

Yeah, it writes so much code it's crazy, when the task can be solved, as you mentioned, with 1/10th. I mean, they are in the token business, so expect this to continue for as long as they possibly can, as long as they stay a bit better than the competition.

This is what 99% of devs who praise Claude Code don't notice. The real productivity gains are much lower than 10x; maybe 2x, tops. The real gain is that you can be lazy now.

In reality, most tasks you do with an LLM (not talking about greenfield projects, those are vanity metrics) can be completed by a human in roughly the same time with 1/10th of the code. But the catch is that you need to actually think and work instead of talking to a chat or watching YouTube while the prompt is running, which becomes 100x harder after you've used an LLM extensively for a week or so.