vasko 2 days ago

The way AI is able to interact with outside resources is pretty impressive, but the quality of the code it produces still seems questionable to me, more so at larger scope, and its errors are sometimes hard to catch because they're not the kind of errors a human would normally make.

Recently I tried to get Claude to write a script that produces large amounts of code so I could profile a compiler. The script ended up outputting code that uses variables outside of their scope, didn't utilize like 90% of the features of the language, and basically ended up being something that I could have made by spamming copy-paste.

The script itself was also written in a really weird way, utilizing recursion for pretty much everything when most of what it did could be done in simple loops. It ended up being a bit of a nightmare to fix, and the entire time I was asking myself, "Why didn't I just write this in 30 minutes instead of going through all of this?"
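For illustration, here is a minimal sketch of what a loop-based generator could look like. It is not the script described above: the C-like target syntax, the function name `generate_program`, and all of its parameters are assumptions made up for this example. It keeps an explicit stack of scopes, so every variable it reads has already been declared in an enclosing block.

```python
import random


def generate_program(num_functions: int, statements_per_function: int, seed: int = 0) -> str:
    """Emit C-like source text (hypothetical target language) using plain loops.

    Every variable that is read was declared earlier in a scope that is
    still open, so the output never references a name outside its scope.
    """
    rng = random.Random(seed)  # seeded so the same arguments give the same file
    lines = []
    for f in range(num_functions):
        lines.append(f"int func_{f}(int arg) {{")
        scopes = [["arg"]]  # stack of open scopes, each a list of declared names
        depth = 1
        for s in range(statements_per_function):
            visible = [name for scope in scopes for name in scope]
            indent = "    " * depth
            choice = rng.random()
            if choice < 0.2 and depth < 4:
                # Open a nested block; its condition reads a visible variable.
                cond = rng.choice(visible)
                lines.append(f"{indent}if ({cond} > {rng.randint(0, 9)}) {{")
                scopes.append([])
                depth += 1
            elif choice < 0.35 and depth > 1:
                # Close the innermost block; its names are no longer visible.
                scopes.pop()
                depth -= 1
                lines.append("    " * depth + "}")
            else:
                # Declare a new variable from two names that are in scope.
                name = f"v{f}_{s}"
                left, right = rng.choice(visible), rng.choice(visible)
                op = rng.choice("+-*")
                lines.append(f"{indent}int {name} = {left} {op} {right};")
                scopes[-1].append(name)
        # Close any blocks that are still open, then end the function.
        while depth > 1:
            depth -= 1
            lines.append("    " * depth + "}")
        lines.append("    return arg;")
        lines.append("}")
        lines.append("")
    return "\n".join(lines)
```

Something like `generate_program(1000, 200)` would give a large file to feed the compiler. Covering more of the language's features would mean adding more branches to the statement chooser.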

SyneRyder a day ago | parent | next [-]

This caught my eye:

> The script ended up outputting code that uses variables outside of their scope, didn't utilize like 90% of the features of the language

Using variables outside of their scope sounds very unusual to me for Claude. You are using Claude Opus (4.5 or higher) and have set the thinking to High or above, right? Make sure you're not using Claude Haiku. Sonnet can be okay, but I'm sure the developers you've heard raving about it are all using Opus 4.8 or GPT 5.5, and all using it from within Claude Code or Codex (or OpenCode or Pi, tools like that anyway).

Claude should catch something like variables being outside of scope immediately when compiling, and fix it as soon as it notices the compile error.

> The script itself was also written in a really weird way, utilizing recursion for pretty much everything when most of what it did could be done in simple loops...

That's actually a great opportunity to develop a new prompt to give to Claude. AI is really good at pattern matching. Take one of those weird recursion methods Claude came up with, then rewrite it as that simple loop that you would prefer, and show both to Claude. Then ask in the same turn: "This is how I prefer to write this code. Can you suggest a prompt to me that would encourage you to write this style of code instead in future?"
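As a rough illustration of the kind of before/after pair you might paste in, here is a hypothetical example. Both function names and the emitted statements are invented for this sketch; they are not taken from your script.

```python
def emit_statements_recursive(count: int, index: int = 0) -> list[str]:
    """Recursion-heavy style: one call per emitted statement."""
    if index >= count:
        return []
    rest = emit_statements_recursive(count, index + 1)
    return [f"int x{index} = {index};"] + rest


def emit_statements_loop(count: int) -> list[str]:
    """Preferred style: the same output from a plain loop."""
    statements = []
    for index in range(count):
        statements.append(f"int x{index} = {index};")
    return statements
```

Both return the same list for small counts. The recursive one also raises `RecursionError` once `count` passes Python's recursion limit (1000 by default), which is worth pointing out when the whole goal is to generate large amounts of code.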

See if you can get Claude to reduce that down to a simple maxim or principle you can include in a startup prompt you provide at the start of each session, or put into your global CLAUDE.md file that is loaded at the start of every Claude Code session. It might end up being a guideline like "Prefer simple loops over recursion whenever appropriate."

It's possible that the developers you've heard raving about AI have already developed startup prompts / CLAUDE.md files filled with similar maxims & principles, tailored specifically to how they like to code & work, evolved from months of working with AI.

Festro 2 days ago | parent | prev [-]

I can't speak to coding as it's not my area but certainly the pattern I've spotted is that it's best at grunt work. That's where the time savings kick in.

Browsing sites, linking up data, spotting anomalies, writing documentation, formatting documents, etc.

If a task isn't repetitive or doesn't involve ingesting data, then I think the time savings shrink rapidly and the need for oversight increases massively. I think some people are managing to set up enough automated oversight to get round that, but it's adding a layer that multiplies your token usage to do so and still has no guarantee. But certainly all these layers being added are increasing success rates.

Andrej Karpathy is speaking about barely coding now. He has a bias, and a comment from him like that is marketing for Anthropic, but I believe he's found some groove with his setup to achieve that.

I think that as of this month in 2026, we're at a point where the best tips and tricks for getting usable answers out of ChatGPT a year ago have been consolidated into what we now call memory and skills in Claude and other agent-harness-type systems. You might need to explore those more. In fact, I think Claude Code/Cursor have even more layers for checking outputs that I've not even seen in Claude Desktop.

And I think people with your exact issue, and the vast number of people who share that experience, are an audience that the app makers want to do a better job of convincing. The free tiers and marketing sites are going to step up their game gradually and there will be new features that lower failure rates even more.