bandrami a day ago

I don't write code for a living but I administer and maintain it.

Every time I say this people get really angry, but: so far AI has had almost no impact on my job. Neither my dev team nor my vendors are getting me software faster than they were two years ago. Docker had a bigger impact on the pipeline to me than AI has.

Maybe this will change, but until it does I'm mostly watching bemusedly.

kdheiwns 20 hours ago | parent | next [-]

Yep. All AI has done for me is give me the power of how good search engines were 10+ years ago, where I could search for something and find actually relevant and helpful info quickly.

I've seen lots of people say AI can basically code a project for them. Maybe it can, but that seems to heavily depend on the field. Other than boilerplate code or very generic projects, it's a step above useless imo when it comes to gamedev. It's about as useful as a guy who read some documentation for an engine a couple years ago and kind of remembers it but not quite and makes lots of mistakes. The best it can do is point me in the general direction I need to go, but it'll hallucinate basic functions and mess up any sort of logic.

kranner 19 hours ago | parent | next [-]

My experience is the same. There are modest gains compensating for a lack of good documentation and the like, but the human bottlenecks in the process aren't useless bureaucracy. Deciding whether a feature, or a particular UX implementation of it, makes sense: that can't be skipped, sped up, or handed off to any AI.

freddref 17 hours ago | parent [-]

What are these bottlenecks specifically that you feel are essential?

I'm trying to compare this to reports that people aren't reviewing code any more.

kranner 15 hours ago | parent [-]

When features and their exact UI implementations are being developed, feedback and discussions around those things.

bee_rider 20 hours ago | parent | prev | next [-]

Thinking of it, I haven’t seen as many “copy paste from StackOverflow” memes lately. Maybe LLMs have given people the ability to

1) Do that inside their IDEs, which is less funny

2) Generate blog posts about it instead of memes

Izkata 4 hours ago | parent [-]

Python one-upped that a long time ago:

  from stackoverflow import quick_sort
https://github.com/drathier/stack-overflow-import

redhed 5 hours ago | parent | prev | next [-]

What language/engine did you try it with for gamedev? Just curious if it was weak in a popular engine.

mchaver 12 hours ago | parent | prev | next [-]

> All AI has done for me is give me the power of how good search engines were 10+ years ago

So the good old days, before search engines were drowning in ads and dark patterns. My assumption is the big LLMs will go in the same direction after market capture is complete and they need to start turning a profit. If we're lucky, the open-source models can keep up.

demorro 11 hours ago | parent | prev | next [-]

It makes me wonder if the majority of all-in on AI folks are quite young and never experienced pre-enshittification search.

bandrami 10 hours ago | parent [-]

Also I see so much talk about "boilerplate" I can't help but wonder if people just never had decent text editors, or never bothered to customize them at all?

demorro 9 hours ago | parent [-]

Aye, I know. Don't get me wrong, I knew that the majority of devs have always been worse than useless, but it's been disconcerting to see quite how much value folk are getting out of agents for problems that have been solved for decades.

Arguably this solution is "better" because you don't even really need to understand that you have specific problems to have the agent solve them for you, but I fail to see the point of keeping these people employed in that case. If you haven't been able to solve your own workflow issues up until now I have zero trust in you being able to solve business problems.

joquarky 11 minutes ago | parent [-]

> the majority of devs have always been worse than useless

I disagree with "always".

This is only the recent wave of brogrammers who care nothing about the quality of the tech and are only in this industry for the gold rush.

They aren't inherently technically minded, they just know how to schmooze their way around and convince decision makers to follow capricious trends over solid practices.

throwaw12 19 hours ago | parent | prev | next [-]

> how good search engines were 10+ years ago

For me this is a huge boost in productivity. If I remember how I was working in the past (example of Google integration):

Before:

    * go through docs to understand how to start (quick start) and things to know
    * start boilerplate (e.g. install the scripts/libs)
    * figure out configs to enable in GCP console
    * integrate basic API and test
    * of course it fails, because it's a Google API, so it's difficult to work with
    * along the way figure out why Python lib is failing to install, oh version mismatch, ohh gcc not installed, ohh libffmpeg is required,...
    * somehow copy paste and integrate first basic API
    * prepare for production, ohhh production requires different type of Auth flow
    * deploy, redeploy, fix, deploy, redeploy
    * 3 days later -> finally hello world is working

Now:

    * Hey my LLM buddy, I want to integrate Google API, where do I start, come up with a plan
    * Enable things which requires manual intervention
    * In the meantime the LLM integrates the code, installs libs, asks me to approve installation of libpg, libffmpeg, ...
    * test, if fails, feed the error back to LLM + prompt to fix it
    * deploy

noosphr 18 hours ago | parent [-]

This is what you'd use a search engine for 10 years ago.

The docs used to be good enough that there would be an example which did exactly what you needed more often than the llm gets it right today.

rhubarbtree 12 hours ago | parent | prev [-]

Are you using Claude Opus 4.5/6?

If not, then you’re not close to the cutting edge.

bandrami 11 hours ago | parent [-]

Until two weeks from now, at which point you'll be hopelessly obsolete. I've seen this treadmill before and am happy to let it settle down first.

thewebguyd a day ago | parent | prev | next [-]

Same here, more or less, in the ops world. Yeah, I use AI, but I can't honestly say it's massively improved my productivity or drastically changed my job in any way, other than that the emails I get from the other managers at work are now clearly written by AI.

I can turn out some scripts a little bit quicker, or find an answer to something a little quicker than googling, but I'm still waiting on others most of the time, the overall company processes haven't improved or gotten more efficient. The same blockers as always still exist.

Like you said, there has been other tech that has changed my job over time more than AI has. The move to the cloud, Docker, Terraform, Ansible, etc. have all had far more of an impact on my job. I see literally zero change in the output of others, both internally and externally.

So either this is a massively overblown bubble, or I'm just missing something.

linsomniac 19 hours ago | parent | next [-]

You're missing something.

I've been in ops for 30 years, Claude Code has changed how I work. Ops-related scripting seems to be a real sweet spot for the LLMs, especially as they tend to be smaller tools working together. It can convert a few sentences into working code in 15-30 minutes while you do something else. I've given it access to my apache logs Elastic cluster, and it does a great job at analyzing them ("We suspect this user has been compromised, can you find evidence of that?"). It's quite startling, actually, what it's able to do.

thewebguyd 19 hours ago | parent [-]

Yeah, it's useful for scripting, but it's still only marginally faster. It certainly hasn't been "groundbreaking productivity" like it's being sold.

The problem with analyzing logs is determinism. If I ask Claude to look for evidence of compromise, I can't trust the output without also going and verifying myself. It's now an extra step, for what? I still have to go into Elastic and run the actual queries to verify what Claude said. A saved Kibana search is faster, and more importantly, deterministic. I'm not going to leave something like finding evidence of compromise up to an LLM that can, and does, hallucinate especially when you fill the context up with a ton of logs.

An auditor isn't going to buy "But Claude said everything was fine."

Is AI actually finding things your SIEM rules were missing? Because otherwise, I just don't see the value in having a natural-language interface for queries I already know how to run; it's less intuitive for me and non-deterministic.
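
For what it's worth, the kind of deterministic query I mean can be pinned down in a few lines. This is only a sketch: the index and field names below are made up for illustration, not taken from any real cluster.

```python
# A deterministic compromise check: the same query body always runs the
# same way, so the result is auditable. Field/index names are hypothetical.
failed_login_query = {
    "bool": {
        "filter": [
            {"term": {"event.action": "authentication_failure"}},
            {"term": {"user.name": "suspect_user"}},
            {"range": {"@timestamp": {"gte": "now-24h"}}},
        ]
    }
}

# With elasticsearch-py this would run against a live cluster, e.g.:
#   es.search(index="apache-logs-*", query=failed_login_query)
```

An LLM can help draft a query like this, but the artifact you hand an auditor is the query itself, not a chat transcript.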

It's certainly a useful tool, there's no arguing that. I wouldn't want to go back to working without it. But I don't buy that it's already this huge labor-market transformation force that's magically 100x-ing everyone's productivity. That part is 100% pure hype, not reality.

bandrami 19 hours ago | parent | next [-]

The tolerance for indeterminacy is, I think, a generational marker; people ~20 years younger than me just kind of think of all software as indeterminate to begin with (because it's always been ridiculously complicated and event-driven for them), and it makes talking about this difficult.

sebmellen 18 hours ago | parent | next [-]

I shudder to think of how many layers of dependency we will one day sit upon. But when you think about it, aren't biological systems kind of like this too? Fallible, indeterminate, massive, labyrinthine, and capable of immensely complex and awe-inspiring things at the same time…

kiba 17 hours ago | parent | prev [-]

People younger than me are not even adults. I grew up during the dial up era and then the transition to broadband. I don't think software is indeterminate.

linsomniac 18 hours ago | parent | prev [-]

>still only marginally faster.

Is it? A couple of days ago I had it build tooling for a one-off task I needed to run; it wrote ~800 lines of Python to accomplish this in <30m. I found it was too slow, so I got it to convert it to run multiple tasks in parallel with another prompt. It would have taken a couple of days for me to build by hand, given the number of interruptions I have in the average day. This isn't a one-off; it's happening all the time.

keeda 21 hours ago | parent | prev | next [-]

> ... but I'm still waiting on others most of the time, the overall company processes haven't improved or gotten more efficient. The same blockers as always still exist.

And that's the key problem, isn't it? I maintain current organizations have the "wrong shape" to fully leverage AI. Imagine instead of the scope of your current ownership, you own everything your team or your whole department owns. Consider what that would do to the meetings and dependencies and processes and tickets and blockers and other bureaucracy, something I call "Conway Overhead."

Now imagine that playing out across multiple roles, i.e. you also take on product and design. Imagine what that would do to your company org chart.

I added a much more detailed comment here: https://news.ycombinator.com/item?id=47270142

applfanboysbgon 20 hours ago | parent | next [-]

> Imagine instead of

> Now imagine

> Imagine what that would do

Imagine if your grandma had wheels! She'd be a bicycle. Now imagine she had an engine. She could be a motorcycle! Unfortunately for grandma, she lives in reality and is not actually a motorcycle, which would be cool as hell. Our imagination can only take us so far.

To reply more substantively to your longer linked comment: your hypothesis is that people spend as little as 10% of their time coding and the other 90% in meetings, but that if they could code more, they wouldn't need to meet other people, because they could do all the work of an entire team themselves[1]. The problem with your hypothesis is that you take for granted that LLMs actually allow people to do the work of an entire team themselves, and that it is merely bureaucracy holding them back. There have been absolutely zero indicators that this is true. No productivity studies of individual developers tackling tasks show a 10x speedup; results tend to be anywhere from +20% to -20%. We aren't seeing amazing software being built by individual developers using LLMs. There is still only one Fabrice Bellard in the world, even though, if your premise could escape the containment zone of imagination, anyone should be able to be a Bellard on their own time with the help of LLMs.

[1] Also, this is basically already true without LLMs. It is the reason startups are able to disrupt corporate behemoths. If you have just a small handful of people who spend the majority of their work time writing code (by hand! No LLMs required!), they can build amazing new products that outcompete products funded by trillion-dollar entities. Your observation of more coding = less meetings required in the first place has an element of truth to it, but not because LLMs are related to it in any particular way.

sgc 20 hours ago | parent | next [-]

     >  Imagine if your grandma had wheels! She'd be a bicycle.
I always took this to be a sharp jab saying the entire village is riding your grandma, giving it a very aggressive undertone. It's pretty funny nonetheless.

Too early to say what AI brings to the efficiency table I think. In some major things I do it's a 1000x speed up. In others it is more a different way of approaching a problem than a speed up. In yet others, it is a bit of an impediment. It works best when you learn to quickly recognize patterns and whether it will help. I don't know how people who are raised with ai will navigate and leverage it, which is the real long-term question (just as the difference between pre- and post-smartphone generations is a thing).

demorro 11 hours ago | parent [-]

1000x is ridiculous. What are you doing where that level of improvement is measurable? That would mean you are now doing in less than half a day things that would have taken you a year of full-time work.

EDIT: Retracted, I think the example given below is reasonably valid.

sgc 10 hours ago | parent [-]

I understand, but the improvement is actually more than that. It is not directly programming, but look at this page [1] for example. I spent years handcrafting parallel texts of English and Greek and had managed to put just under 400 books online. With AI, I managed to translate and put in parallel 1500 more books very quickly. At least 2/3 of those have never been translated into English, ever. That means I have done what the entire history of English-speaking scholars has never managed to do. And the quality is good enough that I have already had publishers contacting me to use the translations. There are a couple other areas where I am getting similar speed ups, but of course this is not the norm.

[1] https://catholiclibrary.org/library/browse/

demorro 10 hours ago | parent [-]

... you know what. Whilst I suspect the quality of these translations is probably not great. Fair play this is a valid example.

sgc 9 hours ago | parent [-]

Of course they are not perfect, but no translation is even close to perfect. The floor is actually incredibly low. All I can say is that many doctoral-level scholars, including myself and some academic publishers, find them to be somewhere between serviceable and better than average.

applfanboysbgon 7 hours ago | parent [-]

Knowing the quality of LLM translations between the two languages I speak, hearing it used like this by supposed academics invokes a deep despair in me. "Serviceable" is a flimsy excuse for mass-producing and publishing slop. Particularly given that slop will displace efforts to produce human translations, putting a ceiling on humanity's future output - no one will ever aspire to do better than slop, so instead of a few great translations, we'll get more slop than we would ever even want to read.

I guess it does depend on the languages involved; one study suggests that it's even worse than Google Translate for some languages, but maybe actually okay at English<-->Spanish?

> There were 132 sentences between the two documents. In Spanish, ChatGPT incorrectly translated 3.8% of all sentences, while GT incorrectly translated 18.1% of sentences. In Russian, ChatGPT and GT incorrectly translated 35.6% and 41.6% of all sentences, respectively. In Vietnamese, ChatGPT and GT incorrectly translated 24.2% and 10.6% of sentences, respectively.

https://jmai.amegroups.org/article/view/9019/html

sgc 6 hours ago | parent [-]

I wouldn't have put it online if I didn't think it was a major improvement over nothing. Realistically, if we haven't translated it in the last 500 years, there is no point for the next several hundred years of history to stick with nothing as well. It takes a bit more than pasting sentences in chatGPT to get a serviceable translation of course, but significantly better results than that are possible. I have not tried translating into other languages, but I am sure having English as the target language is a help.

It's all right there on my website in parallel text, everybody can check and come to their own conclusion rather than driving by with unhelpful generalizations. And really, that is the primary scope of these translations: as aids in reading an original text.

keeda 19 hours ago | parent | prev | next [-]

> No productivity studies of individual developers tackling tasks show a 10x speedup; results tend to be anywhere from +20% to minus 20%.

The only study showing a -20% came back and said, "we now think it's +9% - +38%, but we can't prove rigorously because developers don't want to work without AI anymore": https://news.ycombinator.com/item?id=47142078

Even at the time of the original study, most other rigorous studies showed -5% (for legacy projects and obsolete languages) to +30% (more typical greenfield AND brownfield projects), way back in 2024. Today I hear numbers up to 60% from reports like DX.

But this is exactly missing the point. Most of them are still doing things the old way, including the very process of writing code. Which brings me to this point:

> There have been absolutely zero indicators that this is true.

I could tell you my personal experience, or link various comments on HN, or point you to blogs like https://ghuntley.com/real/ (which also talks about the organizational impedance mismatch for AI), but actual code would be a better data point.

So there are some open-source projects worth looking at, but they are typically dismissed because they look so weird to us. Here are two mostly vibe-coded (as in, minimal code review, apparently) projects that people shredded for having weird code, but that are already used by tens of thousands of people, up to 11-18K stars now. Look at the commit volume and patterns for O(300K) LoC in a couple of months, mostly from one guy and his agent:

https://github.com/steveyegge/beads/graphs/commit-activity

https://github.com/steveyegge/gastown/graphs/commit-activity

It's like nothing we've seen before: almost equal numbers of LoC additions and deletions, in the hundreds of thousands! It's still not clear how this will pan out long term, but the volume of code and apparent utility (based purely on popularity) is undeniable.

laserlight 12 hours ago | parent | next [-]

> we now think it's +9% - +38%

If you are referring to the following quote [0], you are off by a sign:

> we now estimate a speedup of -18% with a confidence interval between -38% and +9%.

[0] https://metr.org/blog/2026-02-24-uplift-update/

demorro 11 hours ago | parent | next [-]

That update blog is funny. The only data they can get at reports slowdowns, but they struggle to believe it because developers self-report amazing speedups.

You'd get the same sort of results if you were studying the benefits of substance abuse.

"It is difficult to study the downsides of opiates because none of our participants were willing to go a day without opiates. For this reason, opiates must be really good and we're just missing something."

keeda 7 hours ago | parent | prev [-]

My bad, I messed up by being lazy while switching from decreases in time taken (which they report) to increases in throughput. (Yes, it's not just flipping the sign, but as I said, I was being lazy!) The broad point still holds: their initial findings have been reversed, and they expect selection effects masked a higher speedup.

The language is confusing, but the chart helps: https://metr.org/assets/images/uplift-2026-post/uplift_timel...

applfanboysbgon 19 hours ago | parent | prev [-]

> they are typically dismissed because they look so weird to us.

I dismiss them because Yegge's work (if it can even be called his work, given that he doesn't look at the code) is steaming garbage with zero real-world utility, not "because they look weird". You suggest the apparent utility is undeniable, while saying "based purely on popularity" -- but popularity is in no way a measure of utility. Yegge is a conman who profited hundreds of thousands of dollars shilling a memecoin rugpull tied to these projects. The actual thousands of users are people joining the hypetrain, looking to get in on the promised pyramid scheme of free money where AI will build the next million dollar software for you, if only you have the right combination of .md files to make it work. None of these software are actually materialising, so all the people in this bubble can do is make more AI wrappers that promise to make other AI wrappers that will totally make them money.

I am completely open to being proven wrong by a vibe-coded open source application that is actually useful, but I haven't seen a single one. Literally not even one. I would count literally anything where the end-product is not an AI wrapper itself, which has tens to hundreds of thousands of users, and which was written entirely by agents. One example of that would be great. Just one. There have been a couple of attempts at a web browser, and Claude's C compiler, but neither are actually useful or have any real users; they are just proofs of concept and I have seen nothing that convinces me they are a solid foundation from which you could actually build useful software from, or that models will ever be on a trajectory to make them actually useful.

keeda 6 hours ago | parent [-]

The memecoin thing was stupid, totally. Yegge should never have touched it, because well, crypto, but also because that's a distraction from the actual project.

> popularity is in no way a measure of utility

Why would it be popular if it's not useful? Yegge is not like some superstar whose products are popular just because he made them. And while some people may be chasing dollars, most of them are building software that scratches an itch. (Search for Beads on GitHub, you'll find thousands of public repos, and lord knows how many private repos.)

Beads has certainly made my agents much more effective, even the older models. To understand its utility you have to do agentic coding for a while, see the stupid mistakes agents make because they forget everything, and then introduce Beads and see almost all those issues melt away.

> None of these software are actually materialising

They are if you look for them. There are many indications (often discussed here) showing spikes in apps on app stores, number of GitHub projects, and Show HN entries. Now, you may dismiss these as "not actually useful", and at this volume that's undoubtedly true for a lot of them.

But there is already early data showing growth not only in mobile app downloads, but also time spent per user and revenue -- which are pretty clear indications of utility: https://sensortower.com/blog/state-of-mobile-2026

Edit: it occurs to me that by "vibe-coding" we may be talking about two different things -- I tend to mean "heavily AI-assisted coding" whereas you likely mean "never look at the code YOLO coding." I'll totally agree that YOLO vibe-coded apps by non-experts will be crap. Other than Beads and Gastown I don't know of any such app that is non-trivial. But then those were steered by a highly experienced engineer, and my original point was, vibe-coding correctly could look very weird by today's best practices.

thewebguyd 3 hours ago | parent [-]

> I tend to mean "heavily AI-assisted coding" whereas you likely mean "never look at the code YOLO coding."

The original point that sparked this sub-thread though is that AI is being overhyped. If actual vibe coding (YOLO it, never look at or understand the code, thus truly enabling non-technical folk to have revolutionary power and ability) doesn't work, then AI is yet just another tool in the toolbelt like any other developer life enhancing tech we've had so far, it's just a new form of IDE.

Being a new form of IDE, while very useful, isn't exactly entire economy transforming revolutionary tech. If it can't be used by someone with zero computer/eng experience to build something useful and revenue generating, the amount of investment we've seen into it is way overblown and is well overdue for a pretty severe correction.

I buy AI as a "developer enhancing tool" just like any other devtools that we've seen over my career. I don't currently buy it as a "total labor economy transformation force."

pishpash 20 hours ago | parent | prev [-]

This isn't the counter you think it is. It's too much to expect existing behemoths to reshape their orgs substantially on a quick enough timeline. The gains will be first seen in new companies and new organizations, and they will be able to stay flat a longer and outcompete the behemoths.

sdf2df 21 hours ago | parent | prev [-]

What a load of fluff lmao. Are you Nadella?

keeda 19 hours ago | parent [-]

Hah! I would say I'm flattered, but I find his style of speaking rather stilted.

tayo42 18 hours ago | parent | prev | next [-]

Ops hasn't been in the crosshairs of AI yet.

Imo it's only a matter of time as companies start to figure out how to use AI. Companies don't seem to have real plans yet, and everyone is still figuring AI out in general.

Soon, though, I think agents will start popping up: things like first-line response to pages, executing automation.

bandrami 18 hours ago | parent [-]

We've had deterministic automation of tier one response for over a decade now. What value would indeterminacy add to that?

tayo42 18 hours ago | parent [-]

To deal with the problems where there is ambiguity in the problem and in the approach to solving it. Not everything is a basic decision tree. Humans aren't deterministic either; the way we would approach a problem is probably different. Is one of us right or wrong? We're generally just focused on end results.

Maybe 2 years ago AI was doing random stuff and we got all those funny screenshots of dumb Gemini answers. The indeterminism leading to random stuff isn't really an issue any more.

The way it thinks keeps it on track.

bandrami 16 hours ago | parent [-]

Two weeks ago I asked a frontier model to list five mammals without "e" in their name and number four was "otter"
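
The deterministic version of that check is, of course, a character-level one-liner; the animal list here is just illustrative:

```python
# A character-level check that a word-tokenizing model can fumble.
mammals = ["otter", "lynx", "capybara", "fox", "wombat", "hedgehog"]
no_e = [m for m in mammals if "e" not in m]
# no_e == ["lynx", "capybara", "fox", "wombat"]
```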

tayo42 8 hours ago | parent [-]

Is identifying mammals without the letter E part of your ops work flow?

Opus 4.6 didn't have an issue with this question though.

thewebguyd 7 hours ago | parent [-]

> Is identifying mammals without the letter E part of your ops work flow?

No, but it can signal unreliability for adjacent tasks. Identifying a CIDR block in traffic logs is a normal part of an ops workflow. It means the model is more likely to fail if you need to generate a complex regex to filter PII from a terabyte of logs. If the model has a blind spot for specific characters because it tokenizes words instead of seeing individual characters, then it can miss a critical failure path because a service name didn't fit its probabilistic training.
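
To make the contrast concrete: CIDR membership is exactly the kind of check that should never be probabilistic. A stdlib sketch, with made-up addresses and an arbitrary block:

```python
import ipaddress

# Deterministic CIDR membership: no tokenization blind spots, and a
# malformed address raises a ValueError instead of being guessed at.
block = ipaddress.ip_network("10.42.0.0/16")

def in_block(addr: str) -> bool:
    return ipaddress.ip_address(addr) in block

hits = [ip for ip in ("10.42.7.9", "192.168.1.5", "10.43.0.1") if in_block(ip)]
# hits == ["10.42.7.9"]
```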

Maybe you need to boilerplate some Terraform. If the model can't reliably (reliably, as in 100% deterministic, does this without fail) parse constraints, it's not just a funny mistake, it's a potential five-figure billing error.

Ops can't run on "mostly accurate." That's just simply not good enough. We need deterministic precision.

For AI to be useful in this world to the extent others have claimed it is for software eng, we'll likely need more advanced world models, not just something that can predict the next most likely token.

tayo42 5 hours ago | parent [-]

Your Terraform written by a person already doesn't have deterministic precision. AI isn't messing these things up either.

If your AI workflow is still dumping logs into a chat and saying "search it for some pattern", then you should see how something like Claude Code approaches problems. These agents are now building scripts to solve problems, which is your deterministic solution.

thewebguyd 5 hours ago | parent | next [-]

That still only just makes it a force multiplier for engineers, like any other tech, not a replacement as it's being hyped and sold as.

Claude resorting to writing code for everything, because that's all the model can do without too many hallucinations and context poisoning, is just a higher speed REPL. Great, that's useful.

But that's not what is being hyped and sold. What's being hyped and sold is "You don't need an Ops guy anymore, just talk to the computer." Well, what happens when the AI decides the "fix" is to just open up 0.0.0.0/0 to the world to make the errors go away? The non technical minimum wage person now just talking to the computer has no idea they just pwned the company.

If AI's answer is "just write a script to solve the prompt", then you still need technical people, and it's vastly over-hyped.

I'll be interested when you actually can just dump logs in a chat and analyze it without the model having to resort to writing code to solve the problem. That will be revolutionary. Imagine all the time I'd save by not having to make business reports, I can just tell the business people to point AI at terabytes of CSV exports and just ask it questions. That is when it will stop just being labor compression for existing engineers, and start being a world changing paradigm shift.

For now, it's just yet another tool in my toolbelt.

tayo42 2 hours ago | parent [-]

Not sure why the implementation is important or not. The point is the system will be triggered by some text input and complete the task asynchronously on its own.

bandrami 2 hours ago | parent | prev [-]

> Your terraform written by a person already doesn't have deterministic precision

Can you expand on that? Because it sure seems to me like it is in fact deterministic unless the person deliberately made it otherwise

tayo42 2 hours ago | parent [-]

If I give you a task to write Terraform or any code, you won't write what I write; you probably won't even write the same thing twice yourself. You can introduce a bug too; we're not perfect. The output of the task "write some terraform" already isn't deterministic when dealing with people.

sdf2df a day ago | parent | prev [-]

Youre not missing anything.

Humans are funny. But most can't seem to understand that the tool is a mirage and that they are putting false expectations on it. E.g. management of firms cutting back on hiring under the expectation that LLMs will do magic, with many cheering "this is the worst it'll be, bro!!".

I just hope more people realise before Anthropic and OAI can IPO. I would wager they are in the process of cleaning up their financials for it.

httpz 20 hours ago | parent | prev | next [-]

This is a classic case of Productivity Paradox when personal computers were first introduced into workplaces in the 80s.

The economist Robert Solow famously said, "You can see the computer age everywhere but in the productivity statistics."

There are many reasons for the lag in productivity gain but it certainly will come.

https://en.wikipedia.org/wiki/Productivity_paradox

bandrami 20 hours ago | parent | next [-]

That's only certain if investments in tech infrastructure always led to productivity increases. But sometimes they just don't. Lots of firms spent a lot of money on blockchain five years ago, for instance, and that money is just gone now.

20k 19 hours ago | parent | next [-]

I find it odd that people universally assume AI is going to be good for productivity.

The loss of skills, the complete loss of visibility into and experience with the codebase, and the complete lack of software architecture design seem like massive killers in the long term.

I have a feeling that we're going to see productivity with AI drop through the floor

hombre_fatal 19 hours ago | parent | next [-]

I'd claim the opposite. Better models design better software, and they're quickly designing better software than what most software developers were writing.

Just yesterday I asked Opus 4.6 what I could do to make an old macOS AppKit project more testable (too lazy to even encumber the question with my own preferences like I usually do), and it pitched a refactor into the Elm architecture. And then it did the refactor while I took a piss.

The idea that AI writes bad software, or can't improve existing software in substantial ways, is really outdated. Just consider how most human-written software is untested, despite everyone agreeing testing is a good idea, simply because test-friendly architecture takes a lot of thought and test maintenance slows you down. AI will do all of that; just mention something about 'testability' in AGENTS.md.

bandrami 19 hours ago | parent | next [-]

OK, so this comes back to the question I started this subthread with: where is this better software? Why isn't someone selling it to me? I've been told for a year that it's coming any day now (though invariably the next month I'm told last month's tools were in fact crap and useless compared to the new generation, so I just have to wait for this round to kick in), and at some point I do have to actually see it if you expect me to believe it's real.

hombre_fatal 18 hours ago | parent | next [-]

How would you know if all software written in the last six months shipped X% faster and was Y% better?

Why would you think you have your finger on the pulse of general software trends like that when you use the same, what, dozen apps every week?

Just looking at my own productivity, as mere sideprojects this month, I've shipped my own terminal app (replaced iTerm2), btrfs+luks NAS system manager, overhauled my macOS gamepad mapper for the app store, and more. All fully tested and really polished, yet I didn't write any code by hand. I would have done none of that this month without AI.

You'd need some real empirics to pick up productivity stories like mine across the software world, not vibes.

bandrami 18 hours ago | parent | next [-]

Right, I'm sympathetic to the idea that LLMs facilitate the creation of software that people previously weren't willing to pay for, but then kind of by definition that's not going to have a big topline economic impact.

littlexsparkee 2 hours ago | parent | next [-]

Well, we don't know. That captures two scenarios: software whose impact is low, as reflected by the lack of investment, and legitimately useful improvements that just weren't valued (fixing slow code, reducing errors, increasing uptime, addressing security concerns) because the cost wasn't appreciated, was papered over by patches, or the company hasn't been bitten yet.

hombre_fatal 10 hours ago | parent | prev [-]

Why did you add that "weren't willing to pay for" condition?

Most of the software I replaced was software I was paying for (iStat Menus, Wispr Flow, Synology/Unraid). That I was paying for a project I could trivially take on with AI was one of the main incentives to do it.

Tanjreeve 18 hours ago | parent | prev [-]

It's on the people pushing AI as the panacea that has changed things to show their workings, not on someone saying "I've not seen evidence of it." Otherwise it's "vibes," as you put it.

eucyclos 17 hours ago | parent | prev [-]

Here's an example: https://eudaimonia-project.netlify.app/

I'm happy to sell it to you, though it is also free. I guided Claude to write this in three weeks, after never having written a line of JavaScript or set up a server before. I'm sure a better JavaScript programmer than I could do this in three weeks, but there's no way I could. I just had a cool idea for making advertising a force for good, and now I have a working version in beta.

I'd say it is better software, but "better" is doing a lot of heavy lifting there. Claude's execution is average and always will be; that's a function of being a prediction engine. But I genuinely think the idea is better than how advertising works today, and this product would not exist at all if I had to write it myself. And I'm someone who has written code before, enough that I was probably a somewhat early adopter of this whole thing. Multiply that by all the people whose ideas get to live now, and I'm sure some ideas will prove to be better even with average execution. Like an LLM, that's a function of statistics.

bandrami 17 hours ago | parent [-]

I'm glad you made something with it that you wanted to make, and as a fan of Aristotle I'm always happy to see the word eudaimonia out there. Best of luck. That said, I don't understand what this does or why I would want the tokens it mentions.

eucyclos 17 hours ago | parent [-]

Yeah, I gotta make a video walkthrough. It's basically a goal tracker combined with an ad filter: write down what you want out of life and block ads, and it replaces them with ads that actually align with your long-term goals instead of distracting from them. The tokens let you add ads to the network, though you also get some for using the goal tracker.

bandrami 16 hours ago | parent | next [-]

Though this does suggest one possible answer to me: the new software is largely web applications, and the web is just a space I don't spend much time anymore other than a few retro sites like this

tasuki 11 hours ago | parent | prev [-]

No, you don't need a video walkthrough. You need that damn web page to explain – in plain language – what this is and what it's good for.

demorro 10 hours ago | parent | next [-]

They can't; they never did the work to discover what it's good for, because they skipped over implementation and concept validation.

This concept will never work outside of their own head. People continue to think producing something is the hard part, my word.

eucyclos 10 hours ago | parent | prev | next [-]

Would the above explanation be better? The website is there because Stripe needs a landing page, and the text is there because I'm trying to communicate the aspiration; the instantiation I can always explain in detail if someone wants to hear how it would work.

tasuki 3 hours ago | parent [-]

> Would the above explanation be better?

No idea. I certainly didn't get it. Goal tracker is one thing, ad blocker is another thing. Why would I want to combine them? And why would I want to see any ads at all? Perhaps I'm just not the target audience...

eucyclos a minute ago | parent [-]

Maybe not, but you might want to see ads because 1) they fund a huge part of the free internet and 2) if they were targeted not at what you're most likely to buy today but at what would most help you achieve the goals you're struggling with, they'd be a constant source of useful information and motivation as you go about your day. That's the part that seems obvious to me but that I have a hard time communicating.

KellyCriterion 8 hours ago | parent | prev [-]

++1

I didn't get it either at first glance when scrolling down the whole page.

eucyclos 4 hours ago | parent [-]

Wow, that is useful feedback, thanks! I'll update that this weekend.

20k 15 hours ago | parent | prev [-]

And now you have no idea how any of the code works

AI writes bad software by virtue of it being written by the AI, not you. No actual team member understands what's going on with the code. You can't interrogate the AI for its decision making. It doesn't understand the architecture it's built. There's nobody you can ask about why anything is built the way it is; it just exists.

It's interesting watching people forget that the #1 most important thing is developers who understand a codebase thoroughly. Institutional knowledge is absolutely key to maintaining a codebase and making good decisions in the long term.

It's always been possible to trade long-term productivity for short-term gains like this. But now you simply have no idea what's going on in your code, which is an absolute nightmare for long-term productivity.

hombre_fatal 6 hours ago | parent | next [-]

You can read as much or as little of the code as you want.

The status quo was that I have no better understanding of code I haven't touched in a year, or code built by other people. Now I have the option to query the code with AI to bootstrap my understanding to exactly the level necessary.

But you're wrong on every claim about LLM capabilities. You can ask the AI exactly why it decided on a given design. You can ask it what the best options were and why it chose that option. You can ask it for the trade-offs.

In fact, this should be part of your Plan feedback loop before you move to Implementation.

20k 5 hours ago | parent [-]

You can ask the AI why, but its answer doesn't come from any kind of genuine reasoning. It doesn't know why it did anything, because it doesn't exist as a sentient being. It just makes something up that sounds good

If you choose to take AI reasoning at face value, you're choosing to accept pretty strong technical debt

mirsadm 13 hours ago | parent | prev [-]

My own observation is that the initial boost to productivity results in massive crippling technical debt.

nikkwong 19 hours ago | parent | prev [-]

Having the productivity "drop through the floor" is a bit hyperbolic, no? Humans are still reviewing the PRs before code merge at least at my company (for the most part, for now).

bandrami 19 hours ago | parent [-]

I don't know that it's likely but it's certainly a plausible outcome. If tooling keeps getting built for this and the financial music stops it's going to take a while for everybody to get back up to speed

Remember this famously happened before, in the 1970s

18 hours ago | parent | next [-]
[deleted]
Tanjreeve 18 hours ago | parent | prev [-]

There's an actual working product now, albeit one that is currently loss-leading. In the software world at least, there is definitely enough value for it to be used, even if it's just a better search engine. I'm not sure why it would disappear if the financial music stops, as opposed to being commoditised.

bandrami 18 hours ago | parent [-]

Because there are cheaper ways to get an equally good search engine? But yes, I imagine some amount of inference will continue even in an AI Winter 3.0 scenario.

salawat 18 hours ago | parent | prev [-]

Ironically, abstraction bloat eats away at any infra gains. We trade more compute to allow people less in tune with the machine to get things done, usually at the cost of the implementation being, eh... suboptimal, shall we say.

bandrami 16 hours ago | parent [-]

I think there's a broad category error here: people see that every gain has been an abstraction (true) and conclude that every abstraction will be a gain (dubious).

kranner 19 hours ago | parent | prev | next [-]

> There are many reasons for the lag in productivity gain but it certainly will come.

Predictions without a deadline are unfalsifiable.

KellyCriterion 8 hours ago | parent [-]

Well, the thing with predictions is that they are in general difficult, especially when it comes to those about the future :-D

danbolt 15 hours ago | parent | prev [-]

My unfounded hunch about the computing bit is that home computers simply became more and more commonplace as we approached the 21st century.

A Commodore 64 was a cool gadget, but "the family computer" became a device that commoditized productivity. The opportunity cost of applying a computer to try something new went to near zero.

It might have been harder for someone to improve the productivity of an old factory in Shreveport, Louisiana with a computer than it was for the upstarts at id to make Doom.

fnordpiglet 19 hours ago | parent | prev | next [-]

My employer is pretty advanced in its use of these tools for development, and it's absolutely accelerated everything we do, to the point that we are exhausting six months of roadmap in a few weeks. However, I think very few companies are operating like this yet. It takes time for tools and techniques to make it out, and Claude Code alone isn't enough. They are basically planning to let go of most of the product managers and eng managers, and I expect they're measuring who is using the AI tools most effectively; everyone else will be let go, likely before year's end. Unlike prior iterations I saw at Salesforce, this time I am convinced they're actually going to do it and pull it off. This is the biggest change I've seen in my 35-year career, and I have to say I'm pretty excited to be going through it, even though the collateral damage to people's lives will be immense. I plan to retire after this as well; I think this part is sort of interesting, but I can see clearly that what comes next is not.

p1esk 19 hours ago | parent | next [-]

I’m observing very similar trends at a startup I’m at. Unfortunately I’m not ready to retire yet.

blackcatsec 18 hours ago | parent | prev [-]

Why are you excited for this? They're not going to give YOU those people's salaries. You will get none of it. In fact, it will drag your salary through the floor because of all the available talent.

fnordpiglet 7 hours ago | parent | next [-]

I’m excited as a computer scientist to see it happening in my life time. I am not excited for the consequences once it’s played out. Hence my comment about retiring, and empathy for everyone who is still around once I do. I never got into this for the money - when I started engineers made about as much as accountants. It’s only post 1997 or so that it became “cool” and well paid. I am doing this because I love technology and what it can do and the science of computing. So in that regard it’s an amazing time to be here. But I am also sad to see the black box cover the beauty of it all.

Karrot_Kream 15 hours ago | parent | prev [-]

I'm very confused by this. Salary is only one portion of your total compensation; the vast majority of tech companies offer equity. The two ways to increase the FMV of your equity are to increase your equity stake or to increase the total value of the equity available. Hitting the same goals with fewer people means your run rate is lower, which increases the value of your equity (the FMV prices in lower COGS for the same revenue). Also, keeping staff on often means offering them increased equity stakes as part of their employment package; letting staff go means more of the equity pool can be distributed to remaining employees.

We aren't fungible workers in a low skill industry. And if you find yourself working in a tech company without equity: just don't, leave. Either find a new tech company or do something else altogether.

camdenreslink 6 hours ago | parent [-]

Equity is negotiable just like salary, and if supply of developer labor increases with the same or less demand, you'll get less equity just like you will get less salary.

steve_adams_86 6 hours ago | parent | prev | next [-]

In my org I get far more done than ever, but I also find it more exhausting.

Because I can get so much done, I've lost my sense of what's enough. And if I can squeeze out a bit more relatively easily, why wouldn't I? When do I hit the brakes?

There are some tasks where LLMs are not all that helpful, and I find myself kind of savoring those tasks.

I'm surprised you don't notice a difference. Where I work it has been transformative. Perhaps it's because we're relatively small and scrappy, so the change in pace is easier with less organizational inertia. We've dramatically changed processes and increased outputs without a loss in quality. For less experienced programmers who are more interested in simple scripts for processing data, their outputs are actually far better, and they're learning faster because the Claude Code UI exposes them to so many techniques in the shell. I now see people using bash pipes for basic operations who wouldn't have known a thing about bash a couple years ago. The other day a couple less-technical people came to me to learn about what tests are. They never would have been motivated to learn this before. It's really cool.

It doesn't reduce work at all, though. We're an under-funded NGO with high ambition. These changes allow us to do more with the same funding. Hopefully it allows us to get more funding, too. I can't see it leading to anyone being let go; we need every brain we can get.

bandrami a day ago | parent | prev | next [-]

The dev team is committing more than they used to. A lot, in fact, judging from the logs. But it's not showing up as a faster cadence of getting me software to administer. Again, maybe that will change.

whateveracct 20 hours ago | parent | next [-]

I think they feel more productive but aren't actually.

righthand a day ago | parent | prev [-]

In my experience it is now twice the number of merge requests, as a follow-up appears to correct the bugs no one reviewed in the first merge request.

silentkat 21 hours ago | parent [-]

I’m at a big tech company. They proudly stated more productivity measures in commits (already nonsense). 47% more commits, 17% less time per commit. Meaning 128% more time spent coding. Burning us out and acting like the AI slop is “unlocking” productivity.

There’s some neat stuff, don’t get me wrong. But every additional tool so far has started strong but then always falls over. Always.

Right now there’s this “orchestrator” nonsense. Cool in principle, but as someone who made scripts to automate with all the time before it’s not impressive. Spent $200 to automate doing some bug finding and fixing. It found and fixed the easy stuff (still pretty neat), and then “partially verified” it fixed the other stuff.

The “partial verification” was it justifying why it was okay it was broken.

The company has mandated that we use this technology. I have an "AI Native" rating. We're being told to put out at least 28 commits a month. It's nonsense.

They’re letting me play with an expensive, super-high-level, probabilistic language. So I’m having a lot of fun. But I’m not going to lie, I’m very disappointed. Got this job a year ago. 12 years programming experience. First big tech job. Was hoping to learn a lot. Know my use of data to prioritize work could be better. Was sold on their use of data. I’m sure some teams here use data really well, but I’m just not impressed.

And I’m not even getting into the people gaming the metrics to look good while actually making more work for everyone else.

sdf2df 21 hours ago | parent | next [-]

Lol, it's gonna take longer than it should for this to play out.

Sunk cost fallacy is very real, for all involved. Especially the model producers and their investors.

Sunk cost fallacy is also real for devs who are now giving up how they used to work: they've made a sunk investment in learning to use LLMs, etc. Hence the "there's no going back" comments that crop up on here.

As I said in this thread, anyone who can think straight (I'm referring to those who adhere to fundamental economic principles) can see what's going on from a mile away.

booleandilemma 19 hours ago | parent | prev [-]

Management is just stupid sometimes. We had a similar metric at my last company and my manager's response was "well how else are we supposed to measure productivity?", and that was supposed to be a legitimate answer.

sdf2df 6 hours ago | parent [-]

The benefits of AI either accrue toward incremental revenue-generation or cost-saving.

It's not rocket science to measure, actually. The issue is most people don't know how to think properly to invent the right proxies.

redhale 12 hours ago | parent | prev | next [-]

I don't doubt your sincerity. But this represents an absolutely bonkers disparity compared to the reality I'm experiencing.

I'm not sure what to say. It's like someone claiming that automobiles don't improve personal mobility. There are a lot of logical reasons to be against the mass adoption of automobiles, but "lack of effectiveness as a form of personal mobility" is not one of them.

Hearing things like this does give me a little hope though, as I think it means the total collapse of the software engineering industry is probably still a few years away, if so many companies are still so far behind the curve.

tasuki 11 hours ago | parent [-]

> It's like someone claiming that automobiles don't improve personal mobility.

I prefer walking or cycling and often walk about 8km a day around town, for both mobility and exercise. (Other people's) automobiles make my experience worse, not better.

I'm sure there's an analogy somewhere.

(Sure, automobiles improve the speed of mobility, if that's the only thing you care about...)

ygouzerh 11 hours ago | parent | prev | next [-]

I feel it differs a lot between companies. Corporates seem to be seeing less impact for now, as external innovation needs tailoring to their needs (e.g. a security solution that needs a three-month project to adapt it to the company's tech stack), whereas startups and smaller firms are seeing the most impact so far.

eucyclos 19 hours ago | parent | prev | next [-]

A tool with a mediocre level of skill in everything looks mediocre when the backdrop is our own area of expertise, and game-changing when the backdrop is an unfamiliar one. But I suspect the real game changer will be that everyone is suddenly a polymath.

sibeliuss 19 hours ago | parent [-]

This ^ Exactly it. This will be the change.

rhubarbtree 12 hours ago | parent | prev | next [-]

This is not a good thing. If you're not already getting exposure and skilling up, you're likely to be in the camp that gets washed away.

If you can’t be exposed to it in your day job, start using Claude opus in the evening so you know what’s coming.

matkoniecz 11 hours ago | parent [-]

So far I have not seen much skill gain from using LLM extensively.

Maybe I will be replaced by matrix multiplication in my job, but if I need to use LLM at some point I expect little benefit from starting now.

Yes, I tried to use Claude Code two months ago. It was scary, but not useful.

rhubarbtree 11 hours ago | parent [-]

"Not useful" -- one of those moments where you have to be able to adjust your views in the face of new evidence. Humans are so wedded to their beliefs that it can be agonising to let go. I have nothing but respect for people who admit they were wrong, though. I remained a skeptic for a long time, but 4.5 was enough to convince me to adopt it for production code.

matkoniecz 10 hours ago | parent [-]

So far I updated from "meh, not useful" to "scary, not useful".

If it were useful I would continue to use it, but at this point I would not use it even if it were free, not proprietary, and not funding my replacement.

lovich 21 hours ago | parent | prev | next [-]

> so far AI has had almost no impact on my job.

Are you hiring?

LPisGood 19 hours ago | parent | next [-]

My company has been hiring a ton over the last year or so. Jobs are out there

cute_boi 20 hours ago | parent | prev [-]

My friend used to say that, and he got quietly fired and outsourced because now someone in India can use ChatGPT to produce similar code, lol.

IMO AI will make 70-80% of jobs obsolete for sure.

bandrami 19 hours ago | parent | next [-]

But, as I said above, I don't produce code; I administer it (administrate? whichever it is).

leptons 16 hours ago | parent | prev [-]

>now someone in India can use ChatGPT to produce similar code,

lol, that sounds like a disaster for the codebase.

sdf2df a day ago | parent | prev | next [-]

I will personally say right now... it's not gonna change, lol.

People who actually know how to think can see it a mile away.

stevenhuang 20 hours ago | parent [-]

It's telling that you feel the need to create a throwaway to voice this opinion.

sdf2df 20 hours ago | parent [-]

1) Not a throwaway; I just can't remember what my old account is called. 2) Feel free to screenshot this, stick it on your desktop, set a reminder, and check the state of the world in 12 months' time.

Job done fella.

jaxn 19 hours ago | parent | next [-]

For some of us, the world has already changed drastically. I am shipping more code, better code, less buggy code WAY faster than ever before. Big systemic changes for the better to our infra as well. There are days where I easily do 2 weeks worth of my best work ever.

I totally understand that not everyone is having that experience. And yet until people live it, it seems they just discount the experience others are having.

I'll take the 12 month bet.

leptons 16 hours ago | parent | next [-]

>I am shipping more code, better code, less buggy code WAY faster than ever before.

It's clearly relative. For all we know you're a crap coder and AI is now your crutch. We have no evidence that, with AI, you're as good as an average developer with a fair amount of experience. And even if you do have a fair amount of experience, that doesn't mean you're a good coder.

sdf2df 6 hours ago | parent [-]

Exactly lol.

The iPod project was done in months, not years. I'm convinced most people aren't as good at programming / focusing on the right stuff as they claim.

salawat 17 hours ago | parent | prev [-]

Cool, and you're doing it on top of the single largest IP hijacking in the history of the world, a massive uptick in infra spend and energy burn to "just throw more compute" at it instead of figuring out how to throw "the right compute" at it, cannibalization of onboarding graduates, and the loss of the friction that used to keep you from running off after what's probably a bad idea on further analysis, because now you can crank it out in a weekend. The last time somebody did that, we got fucking JS. We still haven't rid ourselves of it.

Let us not lose sight of how we got here.

stevenhuang 20 hours ago | parent | prev [-]

12 months I won't be surprised if there's not much change. But in 5 years? 10? Anything can happen. It is presumptuous to think you can project the future capabilities of this technology and confidently state that labour markets will never be affected.

sdf2df 20 hours ago | parent [-]

You prove my point.

Guys like you don't get it. You think OAI, Amazon, etc. can freely pour large amounts of money into this for 5-10 years? Lmao, delusional. Investors are impatient: show huge jumps in revenue this year or you no longer have permission to put monumental amounts of money into this.

Short of that, they'll just destroy the stock price by selling off, leaving employees who get paid via SBC very unhappy.

dolebirchwood 19 hours ago | parent | next [-]

> You think OAI, Amazon etc can freely put large amounts of money into this for 5-10 years?

Won't matter. The Chinese models will be running on potatoes by then and be better than ever.

sdf2df 5 hours ago | parent [-]

By the time that's obvious, investors will have priced it in. Again, repeating myself here.

HWR_14 17 hours ago | parent | prev | next [-]

Whatever you want to say about other companies, Amazon (and Meta) are quite willing to spend many years pouring billions into technology they think will pay off later.

Ekaros 15 hours ago | parent [-]

Looking at Meta and VR, they absolutely can be wrong. So even after investing what seems to be enough, there might not be any payoff.

sdf2df 6 hours ago | parent [-]

And the investors were correct to crush the stock price down to 90-odd dollars, which finally forced Zuck to face the music.

This place is full of bozos.

stevenhuang 5 hours ago | parent | prev | next [-]

Whether investors will see returns soon enough to service their debt loads is entirely another matter. I do agree the likely course of action is that we get a crash of sorts, since the only way their investments pan out is if labour is replaced entirely, which of course sounds unlikely in the near term.

My point is the cat is out of the bag. It doesn't take massive investment to achieve iterative improvements on the SOTA. As long as the technology does not plateau, smaller labs have shown it's possible to advance the frontier independent of large companies and investments. And as those frontiers advance, more and more economically valuable knowledge work will be subsumed by AI. I don't see a way out of this, which is why I am a strong proponent of wealth distribution, e.g. UBI.

greyw 19 hours ago | parent | prev [-]

Such a reductive and superficial way of thinking about how investment works. It makes me confident you aren't really able to make a good prediction.

sdf2df 6 hours ago | parent [-]

Lol, okay, show me your portfolio. I've beaten the market, after adjusting for risk, for years on end.

20 hours ago | parent | prev | next [-]
[deleted]
willmadden a day ago | parent | prev | next [-]

Build a new feature. If you aren't bogged down in bureaucracy it will happen much faster.

YesBox 19 hours ago | parent | next [-]

I don't use LLMs much. When I do, the experience always feels like search 2.0: information at your fingertips, but you need to know exactly what you're looking for to get exactly what you need. The more complicated the problem, the more fractal/divergent the outcomes. (I'm forming the opinion that this is going to be the real limitation of LLMs.)

I recently used copilot.com (which uses GPT 5.1) to help solve a tricky problem for me:

   I have an arbitrary width rectangle that needs to be broken into smaller 
   random width rectangles (maintaining depth) within a given min/max range. 
The first solution merged the remainder (if less than min) into the last rectangle created (regardless of whether it exceeded the max).

So I poked the machine.

The next result used dynamic programming and generated every possible output combination. With a sufficiently large (yet small) rectangle, this is a factorial explosion and stalled the software.

So I poked the machine.

I realized this problem was essentially finding the distinct multisets of numbers that sum to some value. The next result used dynamic programming and only calculated the distinct sets (order is ignored). That way I could choose a random width from a set and then remove that value. (The LLM did not suggest this.) However, even this was slow with a large enough rectangle.

So I poked my brain.

I realized I could start off with a greedy solution: choose a random width within range and subtract it from the remaining width. Once the remaining width is small enough, use dynamic programming. Then I had to handle the edge cases (no sets, when it's okay to break the rules, etc.)

So the LLMs are useful, but this took 2-3 hours IIRC (thinking, implementation, testing in an environment). Pretty sure I would have landed on a solution within the same time frame without them, probably greedy with backtracking to force-fit the output.
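For the curious, the greedy-then-DP approach described above can be sketched roughly like this. This is a hypothetical reconstruction in Python, not the actual code: function names, the choice of switchover threshold (2*max), and the retry-on-dead-end handling are all my illustrative assumptions.

```python
import random

def dp_partitions(target, wmin, wmax):
    """All distinct multisets of widths in [wmin, wmax] summing to target.

    Widths are generated in non-decreasing order, so each multiset
    appears exactly once (order is ignored).
    """
    def rec(left, lo):
        if left == 0:
            return [[]]
        out = []
        for w in range(lo, min(wmax, left) + 1):
            for rest in rec(left - w, w):  # lo=w keeps multisets distinct
                out.append([w] + rest)
        return out
    return rec(target, wmin)

def split_width(total, wmin, wmax, retries=50):
    """Split `total` into random widths, each in [wmin, wmax].

    Greedy phase: pick random widths while plenty of room remains.
    DP phase: once the remainder is at most 2*wmax, enumerate the
    exact multisets that can finish it off and pick one at random.
    Retries on the (rare) dead ends where the remainder can't be
    partitioned; raises if no feasible split is found.
    """
    tail = 2 * wmax  # switch to DP once the remainder is this small
    for _ in range(retries):
        parts, remaining = [], total
        while remaining > tail:
            w = random.randint(wmin, wmax)
            parts.append(w)
            remaining -= w
        finishes = dp_partitions(remaining, wmin, wmax)
        if finishes:
            parts.extend(random.choice(finishes))
            return parts
    raise ValueError("no feasible split found")
```

The DP part only ever runs on a remainder of at most 2*max, so the factorial explosion mentioned above never happens; the edge-case handling (infeasible totals, "okay to break the rules") is where the real version got fiddly.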

gilbetron 9 hours ago | parent | next [-]

I just tested this with Claude Code and Opus 4.6, with the following prompt:

"I have an arbitrary width rectangle that needs to be broken into smaller random width rectangles (maintaining depth) within a given min/max range. The solution needs to be highly performant from an algorithmic standpoint, well-tested using TDD and Red/Green testing, written in python, and not have any subtle errors."

It got the answer you ended up with (if I'm understanding you correctly) the first time in just over 2 minutes of working, and included a solid test suite examining edge cases and with input validation.

YesBox 8 hours ago | parent [-]

How can we verify if you don't post the code?

I appreciate you testing, even though it's not a great comparison:

- My feedback cycle of LLM prompting forced me to be more explicit with each call, which benefited your prompt since I gave you exactly what to look for with fewer nuances.

- Maybe GPT 5.1 is old or kneecapped compared to newer versions of GPT

- Maybe Opus/Claude is just a way better model :P

Please post the code!

Edit: Regarding "exactly what to look for", when solving a new problem, rarely is all the nuance available for the first iteration.

redhale 12 hours ago | parent | prev [-]

> I don't use LLMs much

Sorry to be so blunt, but it's not surprising that you aren't able to get much value from these tools, considering you don't use them much.

Getting value from LLMs / agents is a skill like any other. If you don't practice it deliberately, you will likely be bad at it. It would be a mistake to confuse lack of personal skill for lack of tool capability. But I see people make this mistake all the time.

YesBox 10 hours ago | parent [-]

Would be helpful if you pointed out what I did wrong :).

If it's "you didn't explain the problem clearly enough", then that aligns with my original comment.

windward 8 hours ago | parent [-]

If you ask the chatbot for best practices it will tell you, including that you don't use a chatbot.

bandrami a day ago | parent | prev | next [-]

Most of these are new features, but then they have to integrate with the existing software so it's not really greenfield. (Not to mention that our clients aren't getting any faster at approving new features, either.)

willmadden a day ago | parent [-]

Did you train a self-hosted/open-source LLM on your existing software and documentation? That should make it far more useful. It's not Claude Code, but some of those models are 80% there. In 6 months they'll be where today's Claude Code is.

bandrami a day ago | parent [-]

What would that help us with?

willmadden 8 hours ago | parent [-]

The LLM needs to understand your existing codebase if it's going to be useful building features that integrate with said codebase seamlessly without breaking things or assuming things that don't exist. That's not something you want to give away to a private AI company, so self-host an open source model.

sdf2df a day ago | parent | prev [-]

It's this kind of thinking that tells me people can't be trusted with their comments on here re: "Omg I can produce code faster and it'll do this and that".

No, simply "producing a feature" ain't it, bud. That's one piece of the puzzle.

Kye a day ago | parent | prev [-]

I've taken to calling LLMs processors. A "Hello World" in assembly is about 20 lines and on par with most unskilled prompting. It took a while to get from there to Rust, or Firefox, or 1T parameter transformers running on powerful vector processors. We're a notch past Hello World with this processor.

The specific way it applies to your specific situation, if it exists, either hasn't been found or hasn't made its way to you. It really is early days.