Yes, the code is still important. For example, I had tasked Codex with implementing function calling in a programming language, and it decided the way to do this was to spin up a brand new sub-interpreter on each function call, load a standard library into it, execute the code, destroy the interpreter, and then continue -- even though a partial and much more efficient solution was already there, in comments. The AI solution "worked" and passed all the tests the AI wrote for it, but it was still very, very wrong. I had to look at the code to understand that it did this. To get it right, I guess you have to indicate how to implement it, which requires a degree of expertise beyond prompting.

ai-tamer 4 hours ago

Do you ask it for a design first? Depending on complexity I ask for a short design doc or a function signature + approach before any code, and only greenlight once it looks sane.

ModernMech 3 hours ago

I understand the "just prompt better" perspective, but this is the kind of thing my undergraduate students wouldn't do. Why is the PhD expert-level coder that's supposed to replace all developers doing it? Having to explicitly tell it not to do certain boneheaded things leaves me wondering: what else is it going to do that's boneheaded that I haven't been explicit about?

zozbot234 3 hours ago

Because it's not "PhD-expert level" at all, lol. Even the biggest models (Mythos, GPT-Pro, Gemini DeepThink) are nowhere near the level of effort that would be expected in a PhD dissertation, even in their absolute best domains. Telling it to work out a plan first is exactly how you would supervise an eager but not-too-smart junior coder. That's what AI is like, even at its very best.

TeMPOraL an hour ago

That's not the best framing, IMO. More importantly, even a PhD expert human wouldn't one-shot complex programs out of short, vague requests. There's a process to this. Even a thesis isn't written in one long, amphetamine-fueled evening. It's a process in which every step involves thinking, referencing sources, talking with oneself and other people, exploring possibilities, going two steps forward and one step back, and making decisions at every point.

Those decisions are, by and large, what humans still need to make. If the problem is complex and you desperately avoid deciding, then what AI produces will surprise you, but in a bad way.

ModernMech 2 hours ago

I understand that, but 1) expert-level performance is how they are being sold, and 2) the level of hand-holding is kind of ridiculous. Another example: Codex decided to write two identical functions, linearize_token_output and token_output_linearize. Prompting it not to do things like that feels like plugging holes in a dike. And can you even guarantee through prompting that it won't write duplicate code?

I'll give a third example: I gave Codex some tests and told it to implement the code that would make the tests pass. Codex wrote the tests into the test file, but then marked them as "shouldn't test" and confirmed that all tests pass. Going back, I told it something to the effect of "you didn't implement the code that would make the tests work, implement it". But after several rounds of this, seemingly no amount of prompting would cause it to actually write code -- instead, each time it came back saying it had fixed everything and all tests pass, despite only modifying the test file.

In each example, I keep coming back to the perspective that the code is not abstracted, it's an important artifact and it needs/deserves inspection.

zozbot234 2 hours ago

> the code is not abstracted, it's an important artifact and it needs inspection.

That's a rather trivial consideration though. The real cost of code is not really writing it out to begin with, it's overwhelmingly the long-term maintenance. You should strive to use AI as a tool to make your code as easy as possible to understand and maintain, not to just write mountains of terrible slop-quality code.

porridgeraisin 4 hours ago

Yep, all models today still need prompting that requires some expertise. The same goes for context management: it needs both domain expertise and a general understanding of how these models work.