Hey! I would encourage you to try Claude Code instead, which is also part of your subscription. It's a CLI that takes care of many of the issues you encountered, as it works directly on the code files in a directory. No more copy-pasting or unscrambling results. Likewise, it can run commands itself, e.g. to compile or even test code.

Rochus 5 days ago

I'm working on old hardware and not-recent Linux and compiler versions, and I have no confidence yet in allowing AI direct (write) access to my repositories.

Instead I provided Claude with the source code of a transpiler to C (one file) which is known to work and uses the same IR that the new code generators were supposed to use.

This is a controlled experiment with a clear and complete input and clear expectations and specifications of the output. I don't think I would be able to clearly isolate the contributions and assess the performance of Claude when it has access to arbitrary parts of the source code.

stavros 5 days ago

I use Claude Code with the Max plan, and the experience isn't far off from what you describe. You still need to understand the system and review the implementation, because it makes many mistakes.

That's not the part it saves me time on; it saves me time looking up the documentation. Other than that, it might be slower, because the larger the code change is, the more time I need to spend reviewing, and past a point I just can't be bothered.

The best way I've found is to have it write small functions, and then I tell it to compose them together. That way, I know exactly what's happening in the code, and I can trust that it works correctly. Cursor is probably a better way to do that than Claude Code, though.
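A minimal sketch of that "small functions, then compose them" workflow, for illustration only. The function names and the text-cleaning task are made up here and are not taken from the comment; the point is that each piece is small enough to review at a glance before it is wired into a pipeline.

```python
from functools import reduce
from typing import Callable


def strip_comment(line: str) -> str:
    """Drop everything from the first '#' character onward."""
    return line.split("#", 1)[0]


def normalize_whitespace(line: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(line.split())


def to_lower(line: str) -> str:
    """Lowercase the line."""
    return line.lower()


def compose(*steps: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that applies the given steps left to right."""
    def pipeline(value: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, value)
    return pipeline


# The composition is a single reviewable line built from reviewed parts.
clean_line = compose(strip_comment, normalize_whitespace, to_lower)
```

Calling `clean_line("  Hello   World  # greeting")` returns `"hello world"`.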

t_mahmood 5 days ago

So I am paying $20 for a glorified code generator, which may or may not be correct, to write a small function that I can write for free and be confident is correct, provided I haven't been too lazy to write a test for it.

You may point out that, with tests, it's the same with any AI tool available. But to get to that result I have to keep prompting it until it gives me the desired output, while I may be able to do it myself in 2-3 iterations.

Reading documentation always leaves me a little more knowledgeable than before, while prompting the LLM teaches me nothing.

And I also have to decide which LLM would be good for the task at hand, and most of them will not be free (unless I use a local one, but that will also use the GPU and add an energy cost).

I may be nitpicking, but I see too many holes in this approach.

stavros 5 days ago

The biggest hole you don't see is that it's worth the $20 to make me overcome my laziness, because I don't like writing code, but I like making stuff, and this way I can make stuff while fooling my brain into thinking I'm not writing code.

	t_mahmood 5 days ago
		Sure, that can be a point: it helps you overcome your personal barrier. But that could be anything, and it is not what you were vouching for in the original comment. That was about saving time.

weard_beard 5 days ago

Not only that, but the process described is how you train a junior dev.

There, at least, the wasted time results in the training of a human being who can become sophisticated enough to be a trusted independent implementer in a relatively short time.

turtlebits 5 days ago

Your time isn't free, and I'd certainly value mine at more than $20/month.

I find it extremely useful as a smarter autocomplete, especially for the tedious work: changing function definitions, updating queries when the DB schema changes, and writing HTTP requests/API calls from vendor/library documentation.

	t_mahmood 5 days ago
		Certainly, so I use an IDE, IntelliJ Ultimate to be precise. None of the use cases you mention requires an LLM; they are all available as IDE features. IntelliJ has LLM-based autocomplete, which I am okay with, but it is still wrong too many times. It works extremely well with Rust. Their non-LLM autocomplete is also superb; IIRC it uses ML to suggest the closest relevant match. It also makes refactoring a breeze, and I know exactly what it's going to do. It can even handle database refactoring to a certain extent, and since that doesn't require an LLM, there is no nondeterministic behavior. The IDE also has its own way of doing HTTP requests, and it's really nice! And I can use their live templates to autocomplete any boilerplate code. They only need to be set up once. No need to fiddle with prompts.

mattacular 5 days ago

> The best way I've found is to have it write small functions, and then I tell it to compose them together.

Pretty much how I code without AI, except making my brain break the problem down into small functions and expressing them in code rather than as a chat.

Rochus 5 days ago

> it saves me time in looking up the documentation

I have a Perplexity subscription which I use heavily for that purpose: just asking how something works or should be used, and getting a response that is right to the point, with examples. Very useful indeed. Perplexity gives me access to Claude Sonnet 4 w/o Thinking, which I consider great models, and it can also generate decent code. My intention was to find out how good the recent Claude Opus is in comparison and how much of my work I'm able to delegate. Personally I much prefer the user interface features, performance and availability of Perplexity to Claude.ai.

gommm 5 days ago

I end up using Perplexity a lot too, especially when I'm doing something unfamiliar. It's also a good way to quickly find out what the best practices are for a given framework/language I'm not that familiar with (I usually ask it to link to examples in the wild, and it finds open-source projects illustrating those points).

stavros 5 days ago

I have both, and Perplexity is much more like a search engine than a chat companion (or at least that's how I use it). I like both, though.

Rochus 5 days ago

You can select the model. I very much appreciate the Claude Sonnet models which are very good and rational discussion partners, responding to arguments in detail and critically, allowing for the dialectical exploration of complex topics. I have also experimented with other models including ChatGPT, Gemini or Grok, but the resulting discussions were only a fraction as useful (i.e. more optimized towards affirmative feel-good small talk, from my humble point of view).

	stavros 5 days ago
		Hmm, I've never tried that, even though I prefer Claude in general too. I'll try that, thanks!

fluidcruft 5 days ago

claude-code asks for your permission before it does anything. Once you start trusting it and get comfortable with its behavior, it gets annoying being prompted all the time, so you can whitelist specific commands it wants to run. You can also interactively toggle into (and out of) "accept changes without asking" mode.

(It wasn't clear to me that I would be able to toggle out of accept changes mode, so I resisted for a loooooong time. But turns out it's just a toggle on/off and can be changed in real-time as it's chugging along. There's also a planning state but haven't looked into that yet)

It always asks before running commands unless you whitelist them. I have whitelisted running test suites and linters, for example, so it can iterate on those corners with minimal interaction. I have had to learn to let it go ahead and make small obvious mistakes rather than intervene immediately, because the linters and tests will catch them and Claude will diagnose the failure and fix them at that point.

Anyway I took a small toy project and used that to get a feel for claude-code. In my experience using the /init command to create CLAUDE.md (or asking Claude to interview you to create it) is vital for consistent behavior.

I haven't had good "vibe" experiences yet. Mostly I know what I want to do and just basically delegate implementation. One thing that has worked well for me is to ask Claude to propose a few ways to improve or implement a feature. It's come up with a few things I hadn't thought of that way.

Anyway, claude-code was very good at slowly and incrementally earning my trust. I resisted trying it because I expected it would just run hogwild doing bewildering things, but that's not what it does. It tends to be a bit of an asskisser in its communication style in a way that would annoy me if it were a real person. But I've managed to look past that.

kace91 5 days ago

On Claude you explicitly approve every attempt to use a terminal command (optionally whitelisting it), so there’s no risk that it will force-push something or whatever. You can also whitelist with granularity, for example to let it use git to view git logs but not to commit.

You can just let it work, see what’s uncommitted after it’s over, and get rid of the result if you don’t like it.
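To make the idea of granular whitelisting concrete, here is a minimal sketch of a prefix-based command allowlist. It is not Claude Code's actual mechanism or configuration format; `ALLOWED_PREFIXES` and `is_allowed` are hypothetical names, and the rule of rejecting any command containing shell metacharacters is a simplifying assumption so that something like `git log ; rm -rf x` cannot slip through.

```python
import shlex

# Hypothetical allowlist: each entry is a command prefix, token by token.
# Read-only git subcommands are allowed; "git commit" and "git push" are not listed.
ALLOWED_PREFIXES = [
    ("git", "log"),
    ("git", "status"),
    ("git", "diff"),
]

# Characters that could chain or redirect commands; reject them outright.
SHELL_METACHARS = set(";|&`$<>")


def is_allowed(command: str) -> bool:
    """Return True only if the command starts with an allowlisted prefix."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False
    try:
        tokens = tuple(shlex.split(command))
    except ValueError:
        # Unbalanced quotes or similar: treat as not allowed.
        return False
    return any(tokens[:len(prefix)] == prefix for prefix in ALLOWED_PREFIXES)
```

With this sketch, `is_allowed("git log --oneline")` is true while `is_allowed("git commit -m wip")` and `is_allowed("git push --force")` are false, so anything outside the list would still need explicit approval.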

kelnos 5 days ago

> I have no confidence yet in allowing AI direct (write) access to my repositories.

You don't need to give it write access to your repositories, just to a source tree.
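One way to read that suggestion, sketched under assumptions: hand the tool a throwaway copy of the working tree that contains no `.git` directory, so nothing it does can touch the repository history. The helper name `make_scratch_copy` and the `scratch-` prefix are made up for illustration; the comment does not say how the source tree should be prepared.

```python
import shutil
import tempfile
from pathlib import Path


def make_scratch_copy(source_tree: Path) -> Path:
    """Copy a source tree into a fresh temp directory, leaving out .git."""
    source_tree = source_tree.resolve()
    scratch_root = Path(tempfile.mkdtemp(prefix="scratch-"))
    target = scratch_root / source_tree.name
    # ignore_patterns skips any entry named ".git" at every level of the tree.
    shutil.copytree(source_tree, target, ignore=shutil.ignore_patterns(".git"))
    return target
```

Changes made in the copy can then be reviewed and applied to the real checkout by hand, for example as a diff.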

boesboes 5 days ago

I've been trying it for a couple of months, I can't recommend it either tbh. It's frustrating as hell to work with: super inconsistent, very bad at following its own instructions, wasteful and generally unreliable.

The problem is, it's like a very, very junior programmer that knows the framework well, but won't use it consistently and doesn't learn from mistakes AT ALL. And has amnesia. Fine for some trivial things, but for anything more complicated the hand-holding becomes so involved that you are better off doing it yourself. That way you internalise some of the solutions as well, which is nice because then you can debug it later! Now I have a huge PR that even I myself don't grasp as well as I would want.

But for me the nail in the coffin was the terrible customer service. ymmv.

jennyholzer 5 days ago

[flagged]

Rochus 5 days ago

Do you mean Claude Code should fail? Why?

jennyholzer 5 days ago

[flagged]

Rochus 5 days ago

In what specific programming language/toolchain/technology is your experience? Why do you think that "everybody can tell that chat gpt wrote your code"? Meanwhile I have looked at a lot of LLM-generated code in different languages, and I wouldn't generally subscribe to your statement. And you still didn't explain why Claude should fail. I think it is rather an advantage (once it works reliably in the future).

	blks 5 days ago
		I can tell when my coworkers' Go code is generated by an LLM. I hate it very much.

actionfromafar 5 days ago

Wow, shots fired! Would you add something to that?

jennyholzer 5 days ago

i spend a lot of time fixing unacceptably poor code that LLM platforms have tricked human coworkers into finding adequate.

my coworkers are increasingly ignorant about the software products they work on.

LLM-informed software development is organizationally poisonous.

Businesses selling LLM coding tools occupy the same place in my mind as drug dealers.

	kelnos 5 days ago
		Feels like that's more of a problem with the competence of your coworkers than with the LLM. The LLM is just exposing how bad they are.