pron 10 hours ago

As someone who's both an IC and leads other developers, I disagree with the explanation. As a technical lead, I can predict the quality of the outcome much better with people than with LLMs, and the "failure modes" are much more manageable. As a programmer, I am actually more impressed with AI agents, but in an informed and qualified way: their debugging ability wows me; their coding ability disappoints and frustrates me.

I think the explanation for why executives are so hyped about AI is simply that they're not familiar with its severe current limitations. For example, Garry Tan seems to really believe he's generating 10 KLOC of working code per day; if he were a working developer, he would know he isn't.

orochimaaru 9 hours ago | parent | next [-]

And executives talk to other executives, not to their engineers. I think this is more peer pressure than anything else.

garciasn 9 hours ago | parent | next [-]

I lead a team of Data Engineers, DevOps Engineers, and Data Scientists. I write code and have done so literally for my entire life. AI-assisted codegen is incredible, especially over the last 3-4 months.

I understand that developers feel their code is an art form and are pissed off that their life’s work is now a commodity; but it’s time to either accept what has happened and move on, specialize as an actual artist, or potentially find yourself in a very rough spot.

staticassertion 9 hours ago | parent | next [-]

I wonder if your background just has you fooled. I worked on a data science team and code was always a commodity. Most data scientists know how to code in a fairly trivial way, just enough to get their models built and served. Even data engineers largely know how to just take that and deploy to Spark. They don't really do much software engineering beyond that.

I'm not being precious here or protective of my "art" or whatever. But I do find it sort of hilarious and obvious that someone on a data science team might not understand the aesthetic value of code, and I suspect anyone else who has worked on such a team/ with such a team can probably laugh about the same thing - we've uh... we've seen your code. We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.
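To make the jab concrete, here's a hypothetical sketch of that naming antipattern next to the same pipeline with meaningful names (the data, columns, and values are made up purely for illustration):

```python
import pandas as pd

orders = pd.DataFrame({
    "status": ["shipped", "pending", "shipped"],
    "region": ["east", "east", "west"],
    "total": [10.0, 5.0, 7.5],
})

# The antipattern: positional names carry no meaning downstream.
df1 = orders
df2 = df1[df1["status"] == "shipped"]
df3 = df2.groupby("region")["total"].sum()

# The same pipeline with names a reviewer can actually follow.
shipped = orders[orders["status"] == "shipped"]
revenue_by_region = shipped.groupby("region")["total"].sum()
```

Both produce identical results; only the second one tells you what each intermediate value means.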

I'm not particularly uncomfortable at the moment, because understanding computers, understanding how to solve problems, how to map between problems and solutions, what will or won't meet a customer's expectations, etc., is still core to the job, as it always has been. Code quality is still critical as well: anyone who's vibe-coded >15 KLOC projects will know that models simply cannot handle that scale unless you're diligent about how the code should be structured.

My job has barely changed semantically, despite rapid adoption of AI.

garciasn 9 hours ago | parent | next [-]

I understand that you’re trying to apply your experience to what we do as a team, and that makes sense; but we’re many, many stddevs beyond the 15K LOC threshold you identified and have no issues, because we do indeed take care to ensure we’re building these things the right way.

staticassertion 9 hours ago | parent [-]

So you understand and you agree and confirm my experience?

garciasn 9 hours ago | parent [-]

I have worked at many places and have seen the work of DEs and DSs that is borderline psychotic; but it got the job done, sorta. I have suffered through QA of 10000 lines that I ended up rewriting in less than 100.

So, yes; I understand where you’re coming from. But; that’s not what we do.

staticassertion 9 hours ago | parent [-]

Yes, but then you said that you do what I'm suggesting is still critical: maintain the codebase even if you heavily leverage models. "We do indeed take care to ensure we’re building these things the right way."

bdangubic 9 hours ago | parent | prev [-]

> We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.

https://degoes.net/articles/insufficiently-polymorphic

> My job has barely changed semantically, despite rapid adoption of AI.

it's coming... some places move slower than others, but it's coming

staticassertion 9 hours ago | parent [-]

> https://degoes.net/articles/insufficiently-polymorphic

lol this is not why people do "df1", "df2", etc, nor are those polymorphic names but okay.

> it's coming... some places move slower than other but it's coming

What is coming, exactly? Again, as said, I work at a company that has rapidly adopted AI, and I have been a long time user. My job was never about rapidly producing code so the ability to rapidly produce code is strictly just a boon.

orochimaaru 7 hours ago | parent | prev | next [-]

My problem is that the C-suite fixates on “vibe coding,” when what you actually need is spec-driven dev.

Spec-driven dev is good software engineering practice. It’s been cast aside in the name of “agile” (which has nothing to do with not doing docs, but that’s another discussion).

My problem is that writing good specs takes time, and reviewing code and coaxing the codegen toward specific mechanisms (async, critical sections, rwlocks, etc.) draws on prior dev experience. The general perception in the C-suite is that neither is important now, since “vibing” is what’s in.
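As a concrete illustration of the kind of detail a reviewer still has to coax out of codegen, here's a minimal Python sketch (illustrative only, not from any real generated code) of a critical section that generated code often omits:

```python
import threading

class Counter:
    """Shared mutable state that concurrent callers increment."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, `self._value += 1` is a read-modify-write
        # sequence, so two threads can interleave and lose updates.
        # This is exactly the kind of line a human reviewer has to
        # check in generated code.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value
```

Whether the model emits the lock (or the right lock, e.g. an rwlock for read-heavy access) is the sort of judgment call that still comes from prior dev experience.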

hiq 9 hours ago | parent | prev | next [-]

> their life’s work is now a commodity

Which parts of it exactly? I've considered for loops and if branches "commodities" for a while. The way you organize code, the design, is still pretty much open and not a solved problem, including by AI-based tools. Yes we can now deal with it at a higher level (e.g. in prompts, in English), but it's not something I can fully delegate to an agent and expect good results (although I keep trying, as tools improve).

LLM-based codegen in the hands of good engineers is a multiplier, but you still need a good engineer to begin with.

pron 9 hours ago | parent | prev | next [-]

My problem with the code the agents produce has nothing to do with style or art. The clearest example of how bad it is was shown by Anthropic's experiments, where agents failed to write a C compiler. That's not a very hard programming job to begin with if you know compilers, as the models do, yet they failed even with a practically unrealistic level of assistance: a complete spec, thousands of human-written tests, and a reference implementation used as an oracle (not to mention that the models were trained on both the spec and the reference implementation).

If you look at the evolution of agent-written code you see that it may start out fine, but as you add more and more features, things go horribly wrong. Let's say the model runs into a wall. Sometimes the right thing to do is go back into the architecture and put a door in that spot; other times the right thing to do is ask why you hit that wall in the first place, maybe you've taken a wrong turn. The models seem to pick one or the other almost at random, and sometimes they just blast a hole through the wall. After enough features, it's clear there's no convergence, just like what happened in Anthropic's experiment. The agents ultimately can't fix one problem without breaking something else.

You can also see how they shoot themselves in the foot by adding layer upon layer of defensive coding so thick they themselves can't think through it. I once asked an agent to write a data structure that maintains an invariant in subroutine A and uses it in subroutine B. It wrote A fine, but B ignored the invariant and did a brute-force search over the data, the very thing the data structure was meant to avoid. As it wrote B, the agent explained that it didn't want to trust the invariant established in A, because A might be buggy... Another thing you frequently see is code so intent on success that it has a plan A, plan B, and plan C for everything: it tries to do something one way and adds contingencies for failure.
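A minimal sketch of that failure mode, with hypothetical names (this is not the actual code from the anecdote): subroutine A maintains a sortedness invariant, the correct B exploits it with binary search, and the "defensive" B distrusts it and falls back to a linear scan:

```python
import bisect

class SortedBag:
    """Keeps items sorted on insert (the invariant) so lookups can binary-search."""

    def __init__(self):
        self._items = []

    def add(self, x):
        # Subroutine A: maintain the invariant (list stays sorted).
        bisect.insort(self._items, x)

    def contains(self, x):
        # Subroutine B done right: *use* the invariant, O(log n) binary search.
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def contains_defensive(self, x):
        # The failure mode: distrust the invariant and do an O(n) scan,
        # defeating the entire point of the data structure.
        return any(item == x for item in self._items)
```

Both lookups return the same answers, which is what makes the defensive version so insidious: the code "works," but the structure's purpose has been silently destroyed.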

And so the code and the complexity compound until nothing and no one can save you. If you're lucky, your program is "finished" before that happens. My experience is mostly with gpt5.4 and 5.3-codex, although Anthropic's failed experiment shows that the Claude models suffer from similar problems. What does it say when a compiler expert that knows multiple compilers pretty much by heart, with access to thousands of tests, can't even write a C compiler? Most important software is more complex than a C compiler, isn't as well specified, and the models haven't trained on it.

I wish they could write working code; they just don't.[1] But man, can they debug (mostly because they're tenacious and tireless).

[1]: By which I don't mean they never do, but you really can't trust them to do it the way you can a programmer. Knowing how to code, like knowing how to fly a plane, doesn't mean sometimes getting the right result. It means always getting the right result (within capabilities that, for humans, are usually known in advance).

simianwords 6 hours ago | parent [-]

The thing is, for most places the kind of code they write is good enough. You have painted an awfully pessimistic picture that frankly does not mirror the reality at many enterprises.

> What does it say when a compiler expert that knows multiple compilers pretty much by heart, with access to thousands of tests, can't even write a C compiler?

It does not know compilers by heart; that's just not true. The point of the experiment was to see how big a codebase it can handle without human intervention, and now we know the limits. The limitation has always been context size.

>By which I don't mean they never do, but you really can't trust them to do it as you can a programmer. Knowing to code, like knowing to fly a plane, doesn't mean sometimes getting the right result. It means always getting the right result (within your capabilities that are usually known in advance in the case of humans).

Getting things right ~90% of the time still saves me a lot of time. In fact, I'd assume that's how autopilot works too: it does 90% of the job and the pilot is required to supervise it.

_dwt 9 hours ago | parent | prev [-]

> literally

You were either a very talented baby or we’re justified in questioning your ability to assess the correctness of nitpicky formalisms.

garciasn 9 hours ago | parent [-]

Funny.

pydry 9 hours ago | parent | prev [-]

They're oddly credulous of the shovel salesmen in the gold rush, too.

E.g. when Jensen Huang said that you need to pair your $250k engineer with $250k of tokens.

bitwize 3 hours ago | parent | prev | next [-]

A friend of mine works at a place whose CEO has been completely one-shotted; he vibe-coded an app and decided this could multiply their productivity like a hundredfold. And now he's implementing an AI mandate for every employee, replete with tracking and metrics and the threat of being fired if you don't play ball.

I was explaining this to my wife, who asked why the CEO doesn't understand the limitations and drawbacks the programmers are experiencing. And I said: he doesn't care, because he's looking at what other businesses are doing, what they're writing about in Bloomberg and the WSJ, what "industry best practice" is, and where the money is going. Trillions of dollars are going into revolutionizing every industry with AI. If you're a CEO and you're not angling to capture a piece of that, then the board is going to have serious questions about your ability to lead the company. Executives are often ignorant of the problems faced by line workers, in a way perhaps best explained by a particular scene from Swordfish (2001): "He lives in a world beyond your world..." https://www.youtube.com/watch?v=jOV6YelKJ-A The complaints of a few programmers just don't matter when you have millions or billions in capital at your command and business experts are saying you can tenfold your output with half the engineering workforce.

Right now programmers have only two choices: embrace generative AI fully and become proficient at it (instead of surfacing problems with it, offer solutions: how can we use AI to make this better?), or have a very, very hard time working in the field.

Aperocky 9 hours ago | parent | prev | next [-]

Past a certain LOC count, the utility becomes negative (unless it's pure tests).

apothegm 7 hours ago | parent | next [-]

Even too many tests have negative utility: more upkeep, and the code gets harder to change.

And with LLMs also more context and token usage and cost.

Aperocky 7 hours ago | parent [-]

The biggest differentiating factor today is whether engineers and/or decision makers are willing to say no to a certain feature or implementation.

It's too easy to add bloat and complexity that can never go away, and with the tooling we now have, a significant portion of engineers are now an active risk to the projects they are working on.

simianwords 6 hours ago | parent | prev [-]

I disagree because I use it in a pretty huge codebase and it definitely saves time.

detourdog 9 hours ago | parent | prev [-]

I'm sure an IC is not an integrated circuit or independent contractor. So what is it?

rkomorn 2 hours ago | parent | next [-]

Individual Contributor, usually as opposed to any kind of management role.

auggierose 2 hours ago | parent | prev [-]

It is Silicon Valley speak for "programmer".