thisisbrians 3 hours ago

It is and will always be about: 1) properly defining the spec 2) ensuring the implementation satisfies said spec

nickjj 3 hours ago | parent | next [-]

> properly defining the spec

Why do you often need to re-prompt things like "can you simplify this and make it more human readable without sacrificing performance?". No amount of specification addresses this on the first shot unless you already know the exact implementation details in which case you might as well write it yourself directly.

I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of a git commit.

I sometimes use AI for tiny standalone functions or scripts so we're not talking about a lot of deeply nested complexity here.

seanmcdirmid 3 hours ago | parent | next [-]

> I often have to put in a prompt like this 5-10 times before the code resembles something I'd even consider using as a 1st draft base to refactor into something I would consider worthy of a git commit.

Are you stuck entering your prompts manually, or do you have it set up as a feedback loop like "beautify -> check beauty -> if not beautiful enough, beautify again"? I can't imagine why everyone thinks AIs can just one-shot everything like correctness, optimization, and readability; humans can't one-shot these either.

nickjj 3 hours ago | parent [-]

I do everything manually. Prompt, look at the code, see if it works (copy / paste), and if it works but it's written poorly I'll re-prompt to make the code more readable, often ending with me making it more readable myself without extra prompts. Btw, this isn't about code formatting or linting. It's about how the logic is written.

> I can't imagine why everyone thinks AIs can just one-shot everything like correctness, optimization, and readability; humans can't one-shot these either.

If it knows how to make the code more readable and / or better for performance by me simply asking "can you make this more readable and performant?" then it should be able to provide this result from the beginning. If not, we're admitting it's providing an initial worse result for unknown reasons. Maybe it's to make you as the operator feel more important (yay I'm providing feedback), or maybe it's to extract the most amount of money it can since each prompt evaluates back to a dollar amount. With the amount of data they have I'm sure they can assess just how many times folks will pay for the "make it better" loop.

seanmcdirmid 3 hours ago | parent [-]

Why do you orchestrate the AI manually? You could write a BUILD file that just runs it in a loop a few times, or, I guess, if you lack build system integration, write a Python script?

> If it knows how to make the code more readable and / or better for performance by me simply asking "can you make this more readable and performant?" then it should be able to provide this result from the beginning.

This is the wrong way to think about AI (at least with our current tech). If you give AI a general task, it won't focus its attention at any of these aspects in particular. But, after you create the code, if you use separate readability and optimization feedback loops where you specifically ask it to work on those aspects of the code, it will do a much better job.

People who feel like AI should just do the right thing already without further prompting or attention focus are just going to be frustrated.

> Btw, this isn't about code formatting or linting. It's about how the logic is written.

Yes, but you still aren't focusing the AI's attention on the problem. You can also write a guide that it puts into context for things you notice that it consistently does wrong. But I would make it a separate pass, get the code to be correct first, and then go through readability refactors (while keeping the code still passing its tests).

giancarlostoro 3 hours ago | parent | prev [-]

There are two secret sauces to making Claude Code your b* (please forgive me, future AI overlords). One is to create a spec. The other is to not prompt merely "what" you want, but also HOW you want it done (you can get insanely detailed or stay just vague enough); in some cases the WHY is useful for it to know and understand, and sometimes WHO it's for as well. Give it the context you know. Don't know anything about the code? Ask it to read it, all of it; you've got 1 million tokens, go for it.

I have one-shot prompted projects from empty folder to full-featured web app with accounts, login, profiles, you name it. Insanely stable, maybe an oops here or there, but for a non-spec single-prompt shot, that's impressive.

When I don't use a tool to handle the task management, I have Claude build up a markdown spec file for me and specify everything I can think of. Output is always better when you specify the technology and design patterns you want it to use.

QuadrupleA 2 hours ago | parent | prev | next [-]

Side note: everyone's talking about having AI agents "conform to the spec" these days. Am I in my own bubble, or who the hell these days gets The Spec as a well-formed document? Let alone a good document: something that can be formally verified, thoroughly test-cased, and can christen the software "complete" when all its boxes are ticked?

This seems like 1980s corporate waterfall thinking. It doesn't jibe with the messy reality I've seen: customers with unclear ideas, changing market and technical environments, the need for iteration and experimentation, mid-course correction, etc.

Aurornis an hour ago | parent [-]

> who the hell these days gets The Spec as a well-formed document?

The PMs asked ChatGPT to write a well-formed spec.

Sadly, true in too many companies right now.

I do agree with your general point that The Spec can become a crutch for washing your hands of any responsibility for knowing the product, the goals, the company's business, and other contexts. I like to defuse these ideas by reminding the engineers that The Spec is a living document and they are partially responsible for it, too. Once everyone learns that The Spec isn't a crutch for shifting all blame to the product manager, they become more involved in making sure it's right.

krupan 2 hours ago | parent | prev | next [-]

Good sir, have you heard the Good Word of the Waterfall development process? It sounds like that's what you are describing

bwestergard 3 hours ago | parent | prev | next [-]

That can't be the whole story, right? Because there are an arbitrarily large number of (e.g.) Rust programs that will implement any given spec expressed in terms of unit tests, types, and perhaps some performance benchmarks.

But even accounting for all these "hard" constraints and metrics, there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints and perform equally on all relevant metrics.

We do treat programs as efficient causes[1] of side effects in computing systems: a file is written, a block of memory is updated, etc. and the program is the cause of this.

But we also treat them as statements of a theory of the problem being solved[2]. And this latter treatment is often more important socially and economically. It is irrational to be indifferent to the theory of the problem the program expresses.

[1]: https://en.wikipedia.org/wiki/Four_causes#Efficient

[2]: https://pages.cs.wisc.edu/~remzi/Naur.pdf

MeetingsBrowser 3 hours ago | parent [-]

> there are clearly reasons to prefer some possible programs over others even when they all satisfy the same constraints

Maintainability is a big one missing from the current LLM/agentic workflow.

When business needs change, you need to be able to add on to the existing program.

We create feedback loops via tests to ensure programs behave according to the spec, but little to nothing in the way of code quality or maintainability.
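One way to put at least a crude quality signal into such a feedback loop is to gate generated code on a complexity budget in addition to the tests. This is only an illustrative sketch with an arbitrary threshold, not a real maintainability metric; the function names and budget are made up for the example:

```python
# Crude maintainability gate: reject code whose worst function
# exceeds a branching budget. A stand-in for real review, not a
# substitute for it.
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)

def max_branches_per_function(source: str) -> int:
    """Return the highest count of branching nodes in any one function."""
    tree = ast.parse(source)
    worst = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            count = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            worst = max(worst, count)
    return worst

def maintainable(source: str, budget: int = 8) -> bool:
    """Gate used alongside the test suite in a regeneration loop."""
    return max_branches_per_function(source) <= budget
```

A tool like a linter or a cyclomatic-complexity checker would do the same job more seriously; the point is only that the loop can check for something beyond "the tests pass".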

raizer88 3 hours ago | parent | prev | next [-]

AI: "Yes, the specs are perfectly clear and architectural standards are fully respected."

[Imports the completely fabricated library docker_quantum_telepathy.js and calls the resolve_all_bugs_and_make_coffee() method, magically compiling the code on an unplugged Raspberry Pi]

AI: "Done! The production deployment was successful, zero errors in the logs, and the app works flawlessly on the first try!"

ambicapter 3 hours ago | parent | prev | next [-]

Then pulling the lever until it works! You can also code up a little helper to continuously pull the lever until it works!

SV_BubbleTime 3 hours ago | parent [-]

We have a monkeys and typewriters thing for this already.

Just instead of hitting keys, they’re hitting words, and the words have probability links to each other.

Who the hell thinks this is ready to make important decisions?

rawgabbit 3 hours ago | parent | prev | next [-]

I had a CIO tell me 15 years ago with Agile I was wasting my time with specs and design documents.

vidarh 3 hours ago | parent [-]

I was in a call just today where specs were presented as a new thing.

dgxyz 3 hours ago | parent | prev | next [-]

Well, it's more about how much we care about those.

Which, with the advent of LLMs, means we've just lowered our standards so we can claim success.

CodingJeebus 3 hours ago | parent | prev | next [-]

Personally, I get a huge rush of dopamine seeing LLMs build out complex features very quickly to the point that it will keep me up all night wanting to push further and further.

That's where the gambling metaphor really resonates. It's not about whether the output is correct; I've been building software for many years and I know how to direct LLMs pretty well at this point. But I'm also an alcoholic in recovery and I know that my brain is wired differently than most. And using LLMs has tested my ability to self-regulate in ways that I haven't dealt with since I deleted social media years ago.

natpalmer1776 3 hours ago | parent | next [-]

It also doesn't help that producing features is wired to a sense of monetary compensation. More so if you're building a product to sell that might finally be your ticket to whatever your perception of socio-economic victory is.

CodingJeebus 3 hours ago | parent [-]

That's definitely part of it, sure. I also just get a cosmic kick out of thinking about the possibilities that this technology unlocks, and that thinking can spiral in all sorts of unhealthy ways.

acedTrex 3 hours ago | parent | prev [-]

> Personally, I get a huge rush of dopamine seeing LLMs build out complex features very quickly

I don't think I've read a sentence on this website I can relate to less.

I watch the LLM build things and it feels completely numb; I may as well be watching paint dry. It means nothing to me.

zer00eyz 3 hours ago | parent | next [-]

I wonder if the difference here is age/experience or what you're working on/in.

When I was 20, writing code was interesting; by the time I was 28 it became "solving the problem", and then it moved on to "I only really enjoy a good disaster to clean up".

All of my time has been spent solving other people's problems, so I was never invested in the domain that much.

MrScruff 2 hours ago | parent [-]

Yeah, I used to enjoy writing code, but after a while I realised what I actually enjoy more is creating tools that I (and other people) like to use. Now I can do that really quickly even with my very limited free time, at a higher level of abstraction, but it's still me designing the tool.

And despite the amount of people telling me the code is probably awful, the tools work great and I'm happily using them without worrying about the code any more than I worry about the assembly generated by a compiler.

CodingJeebus 3 hours ago | parent | prev [-]

Trust me, I have many days where I wish I had your relationship to this. I wish it were as boring as watching paint dry. But it triggers that part of my brain that wants more, and I have to be very careful about that.

BurningFrog 3 hours ago | parent | prev [-]

That was always the easy part.

The endless next steps of "and add this feature" or "this part needs to work differently" or "this seems like a bug?" or "we must speed up this part!" is where 98% of the effort always was.

Is it different with AI coding?