AI and the ironies of automation – Part 2(ufried.com)
194 points by BinaryIgor 10 hours ago | 83 comments
ripe 7 hours ago | parent | next [-]

I really like this author's summary of the 1983 Bainbridge paper about industrial automation. I have often wondered how to apply those insights to AI agents, but I was never able to summarize it as well as OP.

Bainbridge by itself is a tough paper to read because it's so dense. It's just four pages long and worth following along:

https://ckrybus.com/static/papers/Bainbridge_1983_Automatica...

For example, see this statement in the paper: "the present generation of automated systems, which are monitored by former manual operators, are riding on their skills, which later generations of operators cannot be expected to have."

This summarizes the first irony of automation, which is now familiar to everyone on HN: using AI agents effectively requires an expert programmer, but to build the skills to be an expert programmer, you have to program yourself.

It's full of insights like that. Highly recommended!

yannyu 6 hours ago | parent | next [-]

I think it's even more pernicious than the paper describes as cultural outputs, art, and writing aren't done to solve a problem, they're expressions that don't have a pure utility purpose. There's no "final form" for these things, and they change constantly, like language.

All of these AI outputs are both polluting the commons where they pulled all their training data AND are alienating the creators of these cultural outputs via displacement of labor and payment, which means that general purpose models are starting to run out of contemporary, low-cost training data.

So either training data is going to get more expensive because you're going to have to pay creators, or these models will slowly drift away from the contemporary cultural reality.

We'll see where it all lands, but it seems clear that this is a circular problem with a time delay, and we're just waiting to see what the downstream effect will be.

hannasanarion 6 hours ago | parent | next [-]

> All of these AI outputs are both polluting the commons where they pulled all their training data AND are alienating the creators of these cultural outputs via displacement of labor and payment

No dispute on the first part, but I really wish there were numbers available somehow to address the second. Maybe it's my cultural bubble, but it sure feels like the "AI Artpocalypse" isn't coming, in part because of AI backlash in general, but more specifically because people who are willing to pay money for art seem to strongly prefer that their money goes to an artist, not a GPU cluster operator.

I think a similar idea might be persisting in AI programming as well, even though it seems like such a perfect use case. Anthropic released an internal survey a few weeks ago that was like, the vast majority, something like 90% of their own workers' AI usage, was spent explaining and learning about things that already exist, or doing little one-off side projects that otherwise wouldn't have happened at all, because of the overhead, like building little dashboards for a single dataset or something, stuff where the outcome isn't worth the effort of doing it yourself. For everything that actually matters and would be paid for, the premier AI coding company is using people to do it.

kurthr 5 hours ago | parent | next [-]

I guess I'm in a bubble, because it doesn't feel that way to me.

When AI tops the charts (in country music) and digital visual artists have to basically film themselves working to prove that they're actually creating their art, it's already gone pretty far. It feels like even when people care (and the great mass do not) it creates problems for real artists. Maybe they will shift to some other forms of art that aren't so easily generated, or maybe they'll all just do "clean up" on generated pieces and fake brush sequences. I'd hate for art to become just tracing the outlines of something made by something else.

Of course, one could say the same about photography where the art is entirely in choosing the place, time, and exposure. Even that has taken a hit with believable photorealistic generators. Even if you can detect a generator, it spoils the field and creates suspicion rather than wonder.

clickety_clack 2 hours ago | parent | prev | next [-]

Art is political more than it is technical. People like Banksy’s art because it’s Banksy, not because he creates accurate images of policemen and girls with balloons.

majormajor 2 hours ago | parent [-]

I think "cultural" is a better word there than "political."

But Banksy wasn't originally Banksy.

I would imagine that you'll see some new heavily-AI-using artists pop up and become name brands in the next decade. (One wildcard here could be if the super-wealthy art-speculation bubble ever pops.)

Flickr, etc, didn't stop new photographers from having exhibitions and being part of the regular "art world" so I expect the easy availability of slop-level generated images similarly won't change that some people will do it in a way that makes them in-demand and popular at the high end.

At the low-to-medium end there are already very few "working artists" because of a steady decline after the spread of recorded media.

Advertising is an area where working artists will be hit hard but is also a field where the "serious" art world generally doesn't consider it art in the first place.

musicale 3 hours ago | parent | prev | next [-]

> people who are willing to pay money for art seem to strongly prefer that their money goes to an artist, not a GPU cluster operator

Businesses which don't want to pay money strongly prefer AI.

sureglymop 3 hours ago | parent | next [-]

Yeah, but if they, for example, use AI to do their design or marketing materials, then the public seems to dislike that. But again, no numbers; that's just how it feels to me.

heavyset_go 2 hours ago | parent | prev [-]

Then they get a product that legally isn't theirs and anyone can do anything with it. AI output isn't anyone's IP, it can't be copyrighted.

smj-edison 4 hours ago | parent | prev | next [-]

I'd distinguish between physical art and digital art tbh. Physical art has already grappled with being automated away with the advent of photography, but people still buy physical art because they like the physical medium and want to support the creator. Digital art (for one off needs), however, is a trickier place since I think that's where AI is displacing. It's not making masterpieces, but if someone wanted a picture of a dwarf for a D&D campaign, they'd probably generate it instead of contracting it out.

crooked-v an hour ago | parent | prev [-]

> more specifically because people who are willing to pay money for art seem to strongly prefer that their money goes to an artist, not a GPU cluster operator.

Look at furniture. People will pay a premium for handcrafted furniture because it becomes part of the story of the result, even when Ikea offers a basically identical piece (with their various solid-wood items) at a fraction of the price and with a much easier delivery process.

Of course, AI art also has the issue that it's effectively impossible to actually dictate details exactly like you want. I've used it for no-profit hobby things (wargames and tabletop games, for example), and getting exact details for anything (think "fantasy character profile using X extensive list of gear in Y specific visual style") takes extensive experimentation (most of which can't be generalized well since it depends on quirks of individual models and sub-models) and photoshopping different results together. If I were doing it for a paid product, just commissioning art would probably be cheaper overall compared to the person-hours involved.

patcon 40 minutes ago | parent | prev | next [-]

> AND are alienating the creators of these cultural outputs via displacement of labor and payment

YES. Thank you for these words. It's a form of ecological collapse. Though to be fair, the creative ecology has always operated at the margins.

But it's a form of library for challenges in the world, like how a rainforest is an archive of genetic diversity, with countless applications like antibiotics. If we destroy it, we lose access to the library, to the archive, just as the world is getting even more treacherous and unstable and is in need of creativity.

vkou 2 hours ago | parent | prev [-]

> So either training data is going to get more expensive because you're going to have to pay creators, or these models will slowly drift away from the contemporary cultural reality.

Nah, more likely is that contemporary cultural reality will just shift to accept the output of the models and we'll all be worse off. (Except for the people selling the models, they'll be better off.)

You'll be eating nothing but the cultural equivalent of junk food, because that's all you'll be able to afford. (Not because you don't have the money, but because artists can't afford to eat.)

BinaryIgor 6 hours ago | parent | prev | next [-]

Yes! One could argue that we might end up with programmers (experts) going through a training of creating software manually first, before becoming operators of AI, and then also regularly spending some of their working time (10 - 20%?) on keeping these skills sharp - by working on purely educational projects, in the old school way; but it begs the question:

Does it then really speed us up and generally make things better?

andoando 5 hours ago | parent [-]

This is a pedantic point no longer worth fighting for but "begs the question" means something is a circular argument, and not "this raises the question"

https://en.wikipedia.org/wiki/Begging_the_question

agumonkey an hour ago | parent | prev | next [-]

I kinda fear that this is an economic plane stall, we're tilting upward so much, the underlying conditions are about to dissolve

And I'd add, that recent LLMs magic (i admit they reached a maturity level that is hard to deny) is also a two edged sword, they don't create abstraction often, they create a very well made set of byproducts (code, conf, docs, else) to realize your demand, but people right now don't need to create new improved methods, frameworks, paradigms because the LLM doesn't have our mental constraints.. (maybe later reasoning LLMs will tackle that, plausibly)

frabonacci 4 hours ago | parent | prev | next [-]

The author's conclusion feels even more relevant today: AI automation doesn’t really remove human difficulty—it just moves it around, often making it harder to notice and more risky. And even after a human steps in, there’s usually a lot of follow-up and adjustment work left to do. Thanks for surfacing these uncomfortable but relevant insights

bitwize 3 hours ago | parent [-]

Sanchez's Law of Abstraction comes to mind: https://news.ycombinator.com/item?id=22601623

fuzzfactor 3 hours ago | parent | prev | next [-]

>skills, which later generations of operators cannot be expected to have.

You can't ring more true than this. For decades now.

For a couple years there I was able to get some ML together and it helped me get my job done, never came close to AI, I only had kilobytes of memory anyway.

By the time 1983 rolled around I could see the writing on the wall, AI was going to take over a good share of automation tasks in a more intelligent way by bumping the expert systems up a notch. Sometimes this is going to be a quantum notch and it could end up like "expertise squared" or "productivity squared" [0]. At the rarefied upper bound. Using programmable electronics to multiply the abilities of the true expert whilst simultaneously the expert utilized their abilities to multiply the effectiveness of the electronics. Maybe only reaching the apex when the most experienced domain expert does the programming, or at least runs the show.

Never did see that paper, but it was obvious to many.

I probably mentioned this before, but that's when I really buckled down for a lifetime of experimental natural science across a very broad range of areas which would be more & more suitable for automation. While operating professionally within a very narrow niche where personal participation would remain the source of truth long enough for compounding to occur. I had already been a strong automation pioneer in my own environment.

So I was always fine regardless of the overall automation landscape, and spent the necessary decades across thousands of surprising edge cases getting an idea how I would make it possible for someone else to even accomplish some of these difficult objectives, or perhaps one day fully automate. If the machine intelligence ever got good enough. Along with the other electronics, which is one of the areas I was concentrating on.

One of the key strategies did turn out to be outliving those who had extensive troves of their own findings, but I really have not automated that much. As my experience level becomes less common, people seem to want me to perform in person with greater desire every decade :\

There's related concepts for that too, some more intelligent than others ;)

[0] With a timely nod to a college room mate who coined the term "bullshit squared"

naveen99 an hour ago | parent | prev | next [-]

I mean how did you get an expert programmer before ? Surely it can’t be harder to learn to program with ai than without ai. It’s written in the book of resnet.

You could swap out ai with google or stackoverflow or documentation or unix…

Legend2440 3 hours ago | parent | prev | next [-]

>the present generation of automated systems, which are monitored by former manual operators, are riding on their skills, which later generations of operators cannot be expected to have.

But we are in the later generation now. All the 1983 operators are now retired, and today's factory operators have never had the experience of 'doing it by hand'.

Operators still have skills, but it's 'what to do when the machine fails' rather than 'how to operate fully manually'. Many systems cannot be operated fully manually under any conditions.

And yet they're still doing great. Factory automation has been wildly successful and is responsible for why manufactured goods are so plentiful and inexpensive today.

gmueckl 3 hours ago | parent [-]

It's not so simple. The knowledge hasn't been transferred to future operators, but to process engineers who are now in charge of making the processes work reliably through even more advanced automation that requires more complex skills and technology to develop and produce.

Legend2440 3 hours ago | parent [-]

No doubt, there are people that still have knowledge of how the system works.

But operator inexperience didn't turn out to be a substantial barrier to automation, and they were still able to achieve the end goal of producing more things at lower cost.

startupsfail 7 hours ago | parent | prev [-]

The same argument was there about needing to be an expert programmer in assembly language to use C, and then same for C and Python, and then Python and CUDA, and then Theano/Tensorflow/Pytorch.

And yet here we are, able to talk to a computer, that writes Pytorch code that orchestrates the complexity below it. And even talks back coherently sometimes.

gipp 7 hours ago | parent | next [-]

Those are completely deterministic systems, of bounded scope. They can be ~completely solved, in the sense that all possible inputs fall within the understood and always correctly handled bounds of the system's specifications.

There's no need for ongoing, consistent human verification at runtime. Any problems with the implementation can wait for a skilled human to do whatever research is necessary to develop the specific system understanding needed to fix it. This is really not a valid comparison.

wasabi991011 7 hours ago | parent | prev | next [-]

No, that is a terrible analogy. High level languages are deterministic, fully specified, non-leaky abstractions. You can write C and know for a fact what you are instructing the computer to do. This is not true for LLMs.

ben_w 6 hours ago | parent [-]

I was going to start this with "C's fine, but consider more broadly: one reason I dislike reactive programming is that the magic doesn't work reliably and the plumbing is harder to read than doing it all manually", but then I realised:

While one can in principle learn C as well as you say, in practice there's loads of cases of people getting surprised by undefined behaviour and all the famous classes of bug that C has.

layer8 2 hours ago | parent | next [-]

There is still the important difference that you can reason with precision about a C implementation’s behavior, based on the C standard and the compiler and library documentation, or its source or machine code when needed. You can’t do that type of reasoning for LLMs, or only to a very limited extent.

Bootvis 5 hours ago | parent | prev [-]

Maybe, but buffer overflows would occur in assembler written by experts as well. C is a fine portable assembler (could probably be better with the knowledge we have now) but programming is hard. My point: you can roughly expect an expert C programmer to produce as many bugs per unit of functionality as an expert assembly programmer.

I believe it to be likely that the C programmer would even write the code faster and better because of the useful abstractions. An LLM will certainly write the code faster but it will contain more bugs (IME).

the_snooze 6 hours ago | parent | prev [-]

>And yet here we are, able to talk to a computer, that writes Pytorch code that orchestrates the complexity below it.

It writes something that's almost, but not quite entirely unlike Pytorch. You're putting a little too much value on a simulacrum of a programmer.

didibus an hour ago | parent | prev | next [-]

A good read, but it reminds me that people see the programmer as being there to identify when the AI makes an error or a mistake.

But in my use of AI agents as a programmer and also for other work, I would say that, while yes, you also have to look for mistakes or errors, most of the time I spend is still on programming the AI.

The AI agent has no idea what it must produce, what it's meant to do, when it can alter something existing to enable something new, etc.

And this is true for both functional and non-functional requirements.

Unlike in traditional manufacturing, you've already built your manufacturing pipeline for a precise output, you've got your CAD designs done, you ran your simulations, you've calibrated everything already for what you want.

So most of the work remains that of programming the machine.

z_ 9 hours ago | parent | prev | next [-]

This is a thought provoking piece.

“But at what cost?”

We’ve all accepted calculators into our lives as being faster and correct when utilized correctly (Minus Intel tomfoolery), but we emphasize the need to know how to do the math in educational settings.

Any post education adult will confirm when confronted with an irregular math problem (or a skill) that there is a wait time to revive the ability.

Programming automation having the potential for skill decay AND being critical path is … worth thinking about.

xorcist 8 hours ago | parent | next [-]

Comparisons with deterministic tools such as calculators will always lead astray. There is no comparable situation where faced with a new problem the AI will just give up. If there is the need for an expert, the need is always there, because there is no indication external to the process that the process will fail.

singpolyma3 7 hours ago | parent | prev | next [-]

Calculators don't do math, they do calculating. Which is to say, they don't think for you. There's not much value in being able to quickly compute some expression in a world with calculators. But there's a huge value in knowing how to know which numbers to feed into the calculation.

kurthr 5 hours ago | parent | next [-]

The biggest problem with calculators (rather than slide rules), was that because calculations with big numbers (large mantissa) were so easy, people got used to doing them that way without consideration.

Using a slide rule meant inherently knowing order-of-magnitude, rounding, and precision. Once calculators make it easy they enable both new kinds of solutions and new kinds of errors (that you have to separately teach to avoid).

At the same time, I basically agree. Humans are very bad calculators and we've needed tools (abacus) for millennia.

bitwize 3 hours ago | parent | prev [-]

I derive tremendous value from being able to calculate taxes, tips, and so forth in my head, or right on the receipt, without having to reach for my phone and launch Droid48. (I know some of y'all are also Droid48 bros.) It's even more profound a convenience than knowing how to drive Emacs with just the keyboard and not having to reach for the goddamn mouse.

eastbound 7 hours ago | parent | prev [-]

We already have generational programming decay. At 25 years old, kids fresh out of uni can’t write a string.contains() routine. They all use .stream() in Java. Matter of generation, fashion and skills to learn. And concerning the programming of C drivers, Apple is the last company to write a filesystem and they already can’t find anyone able to do it.

nuancebydefault 9 hours ago | parent | prev | next [-]

The article discusses basically 2 new problems with using agentic AI:

- When one of the agents does something wrong, a human operator needs to be able to intervene quickly and needs to provide the agent with expert instructions. However since experts do not execute the bare tasks anymore, they forget parts of their expertise quickly. This means the experts need constant training, hence they will have little time left to oversee the agent's work.

- Experts must become managers of agentic systems, a role which they are not familiar with, hence they are not feeling at home in their job. This problem is harder for people managers (of the experts) to recognize as a problem, since they don't often experience it first hand.

Indeed the irony is that AI provides efficiency gains, which as they become more widely adopted, become more problematic because they outfit the necessary human in the loop.

I think this all means that automation is not taking away everyone's job, as it makes things more complicated and hence humans can still compete.

asielen 8 hours ago | parent | next [-]

The way you put that makes me think of the current challenge younger generations are having with technology in general. Kids who were raised on touch screen interfaces vs kids in older generations who were raised on computers that required more technical skill to figure out.

In the same way, when everything just works, there will be no difference, but when something goes wrong, the person who learned the skills before will have a distinct advantage.

The question is if AI gets good enough that slowing down occasionally to find a specialist is tenable. It doesn't need to be perfect, it just needs to be predictably not perfect.

Experts will always be needed, but they may be more like car mechanics, there to fix hopefully rare issues and provide a tune up, rather than building the cars themselves.

jeffreygoesto 8 hours ago | parent [-]

Car mechanics face the same problem today with rare issues. They know the mechanical standard procedures and that they can not track down a problem but only try to flash over an ECU or try swapping it. They also don't admit they are wrong, at least most of the time...

c0balt 2 hours ago | parent [-]

> only try to flash over an ECU or try swapping it.

To be fair, they have wrenches thrown in their way there as many ECUs and other computer-driven components are fairly locked down and undocumented. Especially as the programming software itself is not often freely distributed (only for approved shops/dealers).

grvdrm 7 hours ago | parent | prev | next [-]

Your first problem doesn’t feel new at all. Reminded me of a situation several years ago. What was previously an Excel report was automated into PowerBI. Great right? Time saved. Etc.

But the report was very wrong for months. Maybe longer. And since it was automated, the instinct to check and validate was gone. And tracking down the problem required extra work that hadn’t been part of the Excel flow

I use this example in all of my automation conversations to remind people to be thoughtful about where and when they automate.

all2 3 hours ago | parent [-]

Thoughtfulness is sometimes increased by touch time. I've seen various examples of this over time; teachers who must collate and calculate grades manually showed improved outcomes for their students, test techs who handle hardware becoming acutely aware of the many failure modes of the hardware, and so on.

delaminator 8 hours ago | parent | prev | next [-]

I used to be a maintenance data analyst in a welding plant welding about 1 million units per month.

I was the only person in the factory who was a qualified welder.

layer8 2 hours ago | parent | prev | next [-]

They also made the point that the less frequent failures become, the more tedious it is for the human operator to check for them, giving the example of AI agents providing verbose plans of what they intend to do that are mostly fine, but will occasionally contain critical failures that the operator is supposed to catch.

DiscourseFan 9 hours ago | parent | prev [-]

That's how it tends to go, automation removes some parts of the work but creates more complexity. Sooner or later that will also be automated away, and so on and so forth. AGI evangelists ought to read Marx's Capital.

jennyholzer2 7 hours ago | parent [-]

I seriously doubt that there is even one "AGI evangelist" who has the intellectual capacity to read books written for adult audiences.

bitwize 3 hours ago | parent | next [-]

Marxists have the tendency to think that the Venn diagram of "people who have read and understand Marx" and "Marxists" is a circle. There are plenty of AGI evangelists who are smart enough to read Marx, and many of them probably have. The problem is that, being technolibertarians and that, they think Marx is the enemy.

DiscourseFan an hour ago | parent [-]

That seems patently absurd, considering that the debate is not between Marxists and non-Marxists but accelerationists and orthodox Marxists, who are both readers of Marx; it's just that the former is in alignment with technolibertarianism.

ctoth 6 hours ago | parent | prev [-]

Hi. I am not an evangelist -- I'm quite certain it's going to kill us all! But I would like to think that I'm about the closest thing to an AI booster you might find here, given that I get so much damn utility out of it. I'm interested in reading, I probably read too much! Would you like to suggest a book we can discuss next week? I'd be happy to do this with you.

jiehong 6 hours ago | parent | prev | next [-]

This irony of automation has been dealt with in the aviation industry for pilots for years: auto pilots can actually land the plane in many cases, and do fly the plane on most of the cruise.

Yet, pilots are constantly trained on actual scenarios, and are expected to land airplanes manually monthly (and during take off too).

This ensures pilots maintain their skills, while the auto pilot helps most of the time.

On top of that, plane commands often are half automatic already, aka they are assisted (but not by LLMs!), so it’s a complex comparison.

libraryofbabel 6 hours ago | parent [-]

Yes, but (to write the second half of your post for you!) regulation and incentives are very different in the aviation industry, because safety and planning for long-tail risks is paramount. Therefore airlines can afford to have their pilots spend thousands of hours training on manual control in various scenarios. By contrast, I don’t think the average software development org will encourage its engineers to hand-roll a sizable proportion of their code, if (still a big if) there are major productivity costs in doing so. Rushing the Next Big Feature out the door will almost always beat out long-term investment in dev training, unfortunately.

Don’t get me wrong - manual practice is in some sense the correct solution, and I plan to try and do it myself in the next decade to make sure my skills stay sharp. But I don’t see the industry broadly encouraging it, still less making it mandatory as aviation does.

Addendum: as you probably know, even in aviation, this is hard to get right. (This is sometimes called the “children of the magenta” problem, but it’s really Bainbridge again.) The most famous example is perhaps Air France Flight 447[0], where the pilots put the plane into a stall at 35,000ft when they reacted poorly after the autopilot disconnected, and did not even realize they had stalled the plane. Of course, that crash itself led to more regulations around training in manual scenarios too.

[0] https://admiralcloudberg.medium.com/the-long-way-down-the-cr...

justincormack 3 hours ago | parent [-]

In most industry now you can't make the things by hand any more, there is no fallback. Once things get designed for automation there is no way back.

dsjoerg 3 hours ago | parent | prev | next [-]

> Typically, before people are put in a leadership role directing humans, they will get a lot of leadership training teaching them the skills and tools needed to lead successfully.

I question this.

everdrive 8 hours ago | parent | prev | next [-]

I can feel the skill atrophy creeping in. My very first instinct is to go use the LLM. I think much like forcing yourself to exercise, eat right, and avoid social media / distractions, this will be a new modern skillset; do you have the discipline to avoid becoming useless without an LLM? A small few will be great at this, the middle of the bell curve will do "well enough," and you know the story for the rest.

andy99 8 hours ago | parent | next [-]

I’ve been using LLMs to code for some time and I look at it differently.

I ask myself if I need to understand the code, and if the answer is yes I don’t use an LLM. It’s not a matter of discipline, it’s a sober view of what the minimal amount of work for me is.

layer8 2 hours ago | parent [-]

The only time one doesn’t need to understand the code is when it doesn’t matter if the code is correct, or when it can be tested exhaustively for all possible inputs. Both are pretty rare for me.

delaminator 7 hours ago | parent | prev | next [-]

I haven't written any code in 6 months. But I can still remember how to code in 6502 machine code from the 1980s.

zeroonetwothree 7 hours ago | parent [-]

How can you be sure you remember if you aren’t actually doing it?

kaffekaka 3 hours ago | parent [-]

This is an important question I think. Gradually losing a skill to atrophy is not something you notice consciously.

delaminator 2 hours ago | parent [-]

Come on, I've been coding for 45 years. I don't forget so quickly.

vips7L an hour ago | parent | prev [-]

This just sounds like addiction to the dopamine of instant gratification.

sublimefire 9 hours ago | parent | prev | next [-]

Good discussion of the paper and the observations and ironies. A thing to note is that we do have software factories already, with a bunch of automation in place and folks being trained to deal with incidents. The pools of agents just elevate what we currently have but the tools are still lacking severely. IMO the tools need to improve for us to move forward as it is difficult to observe the decisions of agents when they fall apart.

Also, by and large the current AI tools are not in the critical path yet, well except those drones that lock on targets to eliminate them in case of interference, and even then it is ML. Agents can not be in that path due to predictability challenges yet.

steveBK123 6 hours ago | parent | prev | next [-]

I think for most non-coding tasks we are still in the "convincing liar" stage, and not even at the "it's right 99.9% of the time and humans need to quickly detect the 0.1% errors" problem. I think a lot of the HN crowd misses this because they are programmers using it for programming.

I work at a firm that has given AI tooling to non-developer data analyst type people who otherwise live & die in Excel. Much of their day job involves reading PDFs. I occasionally will use some of the firm's AI tooling for PDF summarizing/parsing/interrogation/etc type tasks and remain consistently underwhelmed.

Given something like 10 PDFs, each with a simple 30-row table and the same title in each file, it ends up puking on 3-4 out of 10 with silent failures: row drops, duplicated data, etc. When you point out it's missed rows, it goes back and duplicates rows to get to the correct row count.
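A minimal sketch of the kind of mechanical check that catches both failure modes described here, assuming each extracted row carries some column that should be unique (the `key_index` parameter and the function name are hypothetical, not part of any particular tool). A row-count check alone is not enough, since padding with duplicates satisfies it.

```python
from collections import Counter


def check_extracted_rows(rows, expected_count, key_index=0):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # Catches silently dropped (or extra) rows.
    if len(rows) != expected_count:
        problems.append(f"expected {expected_count} rows, got {len(rows)}")

    # Catches rows duplicated to pad the count back up.
    key_counts = Counter(row[key_index] for row in rows)
    duplicates = sorted(key for key, n in key_counts.items() if n > 1)
    if duplicates:
        problems.append(f"duplicate keys: {duplicates}")

    return problems
```

For example, `check_extracted_rows([("A", 1), ("B", 2), ("B", 2)], 3)` passes the count check but still reports the duplicated key `"B"`.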

When I used it to interrogate standard company filings PDFs that it has been specially trained on, it gave very convincing answers which were wrong because it had silently truncated its search context to only recent years' financial filings. Nowhere did it show this limitation to the user. It only became apparent after researching the 4th or 5th company, when it decided to caveat its answer with its knowledge window. This invalidated the previous answers, as questions such as "when was the first X" or "have they ever reported Y" were operating on incomplete information.

Most users of these tools are not that technical, and are going to be much more naive in taking the answers for fact without considering the context.

Terr_ 5 hours ago | parent [-]

I'm convinced the best use of these systems will be an explicit two-phase process where they just help people prototype and see and learn how to command regular software.

For example, imagine describing what files you want to find, and getting back a command-line string of find/grep piping. It doesn't execute anything without confirmation, it doesn't "summarize" the results, it's just a narrow tutor to help people in a translation step. A tool for learning that, ideally, eventually puts itself out of a job.

Returning to your PDF scenario: The LLM could help people weave together regular tools of "find regions with keywords" and "extract table as spreadsheet" and "cross-reference two spreadsheets using column values", etc.
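A rough sketch of the "show the command, don't run it without confirmation" step, under the assumption that the translation from a description to arguments has already happened. Both function names are hypothetical, and this only covers a plain `find` by filename rather than a full find/grep pipeline.

```python
import shlex
import subprocess


def build_find_command(directory, name_pattern):
    """Build the argument list for a find-by-filename search."""
    return ["find", directory, "-type", "f", "-name", name_pattern]


def run_if_confirmed(argv, confirm=input):
    """Show the exact command, and run it only if the user answers 'y'."""
    print("Proposed command:", shlex.join(argv))
    answer = confirm("Run it? [y/N] ")
    if answer.strip().lower() != "y":
        return None  # nothing was executed
    # Raw output is returned as-is, with no summarizing step in between.
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout
```

The user always sees the literal command line, so over time they can learn to type it themselves, which is the "puts itself out of a job" part.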

demorro 5 hours ago | parent | prev | next [-]

These observations were made 40 years ago. I suspect we have solved many of these problems now and have close to fully automated manufacturing and flight systems, or close enough that the training trade-off is worth it.

However, this took 40 years and actual fatalities. We should keep that in mind when we're pushing the AI acceleration pedal down ever harder.

jinwoo68 7 hours ago | parent | prev | next [-]

"Most companies are efficiency-obsessed."

But what most of them do is not to be more efficient but to be shown to be more efficient. The main reason they are so obsessed with AI is that they want to send the signal that they are pursuing efficiency, whether they succeed or not.

theologic 6 hours ago | parent [-]

Peter Drucker popularized the phrase "Efficiency is doing things right; effectiveness is doing the right things."

Being incredibly efficient at doing the wrong things turns out to be a massive issue inside most companies. What's interesting is I do think that AI gives an opportunity to be massively more effective, because if you have the right LLM, one that's trained right, you can explore a variety of scenarios much faster than you can by yourself. However, we hear very little about this as a central thrust of how to use AI in the workplace.

jennyholzer2 9 hours ago | parent | prev | next [-]

"Most companies are efficiency-obsessed. Hence, they also expect AI solutions to increase “productivity”, i.e., efficiency, to a superhuman level. If a human is meant to monitor the output of the AI and intervene if needed, this requires that the human needs to comprehend what the AI solution produced at superhuman speed – otherwise we are down to human speed. This presents a quandary that can only be solved if we enable the human to comprehend the AI output at superhuman speed (compared to producing the same output by traditional means)."

everdrive 9 hours ago | parent | next [-]

> "Most companies are efficiency-obsessed. Hence, they also expect AI solutions to increase “productivity”

So this is true on paper, but I can tell you that companies don't broadly do a very good job of being efficient. What they do a good job of is doing the bare minimum in a number of situations, generating fragile, messy, annoying, or tech-debt-ridden systems / processes / etc.

Companies regularly claim to make objective and efficient decisions, but often those decisions amount to little more than doing a half-assed job because it will save money and will probably be good enough. The "probably" does a lot of work here, and when "probably" is not good enough there's a lot of blame shifting / politics / bullshitting.

The idea that companies are efficient is generally not very realistic except when it comes to things with real, measurable costs, such as manufacturing.

conception 8 hours ago | parent | next [-]

I think it’s more that companies can want to be efficient but most people prefer the status quo to change on just about any work task if it requires any relearning or training effort.

SecretDreams 8 hours ago | parent | prev [-]

> What they do a good job of is doing the bare minimum in a number of situations, generating fragile, messy, annoying, or tech-debt-ridden systems / processes / etc.

Is that not efficiency? ~ some managers I know

TheOtherHobbes 9 hours ago | parent | prev | next [-]

Not necessarily. It depends if the process is deterministic and repeatable.

If an AI generates a process more quickly than a human, and the process can be run deterministically, and the outputs are testable, then the process can run without direct human supervision after initial testing - which is how most automated processes work.

The testing should happen anyway, so any speed increase in process generation is a productivity gain.

Human monitoring only matters if the AI is continually improvising new solutions to dynamic problems and the solutions are significantly wrong/unreliable.

Which is a management/analysis problem, and no different in principle to managing a team.

The key difference in practice is that you can hire and fire people on a team, you can intervene to change goals and culture, and you can rearrange roles.

With an agentic workflow you can change the prompts, use different models, and redesign the flow. But your choices are more constrained.

lkjdsklf 8 hours ago | parent [-]

The issue is LLMs are, by design, non-deterministic.

That means that, with the current technology, there can never be a deterministic agent.

Now obviously, humans aren't deterministic either, but the error bars are a lot closer together than they are with LLMs these days.

An easy-to-point-at example is the coding agent that removed someone's home directory, which was circulating around. I'm not saying a human has never done that, but it's far less likely because it's so far out of the realm of normal operations.

So as of today, we need humans in the loop. And this is understood by the people making these products. That's why they have all these permissions and prompts for you to accept/run commands and all of that.

1718627440 8 hours ago | parent | next [-]

> An easy-to-point-at example is the coding agent that removed someone's home directory, which was circulating around. I'm not saying a human has never done that, but it's far less likely because it's so far out of the realm of normal operations.

And it would be far less likely that the human deleted someone else's home directory, and even if he did, there would be someone to be angry at.

ctoth 6 hours ago | parent | prev | next [-]

The viral post going around? The one where the author's own root cause analysis says "Human Error"[0]?

What's the base rate of humans rm -rf'ing their own work?

[0] https://blog.toolprint.ai/p/i-asked-claude-to-wipe-my-laptop

lkjdsklf 6 hours ago | parent [-]

If you read the post, he didn't ask it to delete his home directory. He misread the command it generated and approved it when he shouldn't have.

That's literally exactly the kind of non-determinism I'm talking about. If he'd just left the agent to its own devices, the exact same thing would have happened.

Now you may argue this highlights that people make catastrophic mistakes too, but I'm not sure I agree.

Or at least, they don't often make that kind of mistake. Not saying that they don't make any catastrophic mistakes (they obviously do....)

We know people tend to click "accept" on these kinds of permission prompts with only a cursory read of what they're approving. And the more of these prompts you get, the more likely you are to just click "yes" or whatever to get through it.

If anything this kind of perfectly highlights some of the ironies referenced in the post itself.

loa_in_ 8 hours ago | parent | prev [-]

There's lots of _marketing_ promising unsupervised agents. It's important to remember not to drink the Kool-Aid.

singpolyma3 7 hours ago | parent | prev | next [-]

Superhuman can mean different things though. Most software developers in industry are very very slow and so superhuman, for them, may still be less than what is humanly achievable for someone else. It's not a binary situation.

sokoloff 7 hours ago | parent | prev [-]

Being down to human speed of reviewing code that already passes tests could still be a massive increase over the pace of 12 months ago.

throwaway613745 6 hours ago | parent | prev | next [-]

If your process is shit, you're just automating shit at lightning speed.

If you're bad at your job, you're automating it at lightning speed.

You need to have good business processes and be good at your job without AI in order to have any chance in hell of being successful with it. The idea that you can just outsource your thinking to the AI and don't need to actually understand or learn anything new anymore is complete delusion.

analog8374 3 hours ago | parent | prev | next [-]

I spent years creating automated drawing machines. But I can still draw better than any of them with my hand. Not as quickly tho.

wesammikhail 9 hours ago | parent | prev [-]

Out of curiosity, does anyone know of a good writeup / blog post made by someone in the industry that revolves around reducing orchestration error rates? Would love to read some more about the topic and I'm looking for a few good resources.