vladsh 2 days ago

It’s a bit strange how anecdotes have become acceptable fuel for 1000-comment technical debates.

I’ve always liked the quote that sufficiently advanced tech looks like magic, but it’s a mistake to assume that things that look like magic also share the other properties of magic. They don’t.

Software engineering spans several distinct skills: forming logical plans, encoding them in machine-executable form (coding), making them readable and extensible by other humans (to scale engineering), and constantly navigating tradeoffs like performance, maintainability, and org constraints as requirements evolve.

LLMs are very good at some of these, especially instruction following within well-known methodologies. That’s real progress, and it will be productized sooner rather than later, with concrete use cases, ROI, and a clearly defined end user.

Yet I’d love to see less discussion driven by anecdotes and more discussion about productizing these tools: where they work, usage methodologies, missing tooling, KPIs for specific use cases. And don’t get me started on current evaluation frameworks; they become increasingly irrelevant once models are good enough at instruction following.

scosman 2 days ago | parent | next

> It’s a bit strange how anecdotes have become acceptable fuel for 1000-comment technical debates.

Progress is so fast right now that anecdotes are sometimes more interesting than proper benchmarks. "Wow, it can do impressive thing X" is more interesting to me than a 4% gain on SWE-bench Verified.

In the early days of a startup, "this one user is spending 50 hours/week in our tool" is sometimes more interesting than global metrics like average time in app. In the early, fast-moving days, the potential is more interesting than the current state. There's work to be done to make that one user's experience apply to everyone, but knowing that it can work is still a huge milestone.

elfly a day ago | parent

At this point I believe the anecdotes more than the benchmarks, because I know the LLM devs train the damn things on the benchmarks.

A benchmark? Probably gamed. A guy made an app to right-click and convert an image? Probably true. I have to assume it may have a lot of issues, but prima facie I just make a mental note that this is possible now.

flumpcakes 2 days ago | parent | prev | next

> It’s a bit strange how anecdotes have become acceptable fuel for 1000-comment technical debates.

It's a very subjective topic. Some people claim it increases their productivity 100x. Some think it is not fit for purpose. Some think it is dangerous. Some think it's unethical.

Weirdly, those could all be true at the same time, and where you land depends purely on which of them matters most to you.

> Yet I’d love to see less discussion driven by anecdotes and more discussion about productizing these tools: where they work, usage methodologies, missing tooling, KPIs for specific use cases. And don’t get me started on current evaluation frameworks; they become increasingly irrelevant once models are good enough at instruction following.

I agree. I've said before that I just want these AI companies to release an 8-hour video of one person using these tools to build something extremely challenging, start to finish. How do they use it? How does the tool really work? What are the best approaches? I am not interested in 5-minute demo videos producing React fluff or any other boilerplate machine.

I think the open secret is that these 'models' are not much faster than a truly competent engineer. And what's dangerous is that they empower people to 'write' software they don't understand. We're starting to see the AI companies reflect this in their marketing, saying tech debt is a good thing if you move fast enough...

This must be why my 8-core corporate PC can barely run Teams and a web browser in 2026.

weitendorf 2 days ago | parent

How many 1+ hour videos of someone building with AI tools have you sought out and watched? Those definitely exist. It sounds like you didn't seek them out or watch them, even though, at seven hours shorter, they'd give you enough understanding of where these tools add value to believe they can help with challenging projects.

So why should anybody produce an 8-hour video for you when you wouldn't watch it? Let's be real. You would not watch that video.

In my opinion, most of the people who refuse to believe AI can help them work with software are just incurious, archetypical late adopters.

If you've ever interacted with these kinds of users, even though they might ask for specs, more resources, more demos and case studies, or maturity or whatever, you know that really they are just change-resistant and will probably continue to be as long as they can get away with it being framed as skepticism rather than simply being out of touch.

I don't mean that in a moralizing sense, btw - I think it is a natural part of aging and gaining experience, shifting priorities, being burned too many times. A lot of business owners 30 years ago probably truly didn't need to "learn that email thing", because learning it would have required more of a time investment than it would yield, due to being later in their career with less time for it to pay off, and having already built skills/habits/processes around physical mail that would become obsolete with virtual mail. But a lot of them did end up learning that email thing 5, 10, whatever years later, when the benefits were more obvious and the rest of the world had already reoriented itself around email. Even if they still didn't want to, they'd risk looking like a fossil, "too old" to adapt to changes in the workplace, if they didn't just do it.

That's why you're seeing so many directors/middle managers doing all these thought leader posts about AI recently. Lots of these guys 1-2 years ago were either saying AI is spicy autocomplete or "our OKR this quarter is to Do AI Things". Now they can't get away with phoning it in anymore and need to prove to their boss that they are capable of understanding and using AI, the same way they had to prove that they understood cloud by writing about kubernetes or microservices or whatever 5-10 years ago.

forgotaccount3 2 days ago | parent | next

> In my opinion, most of the people who refuse to believe AI can help them work with software are just incurious, archetypical late adopters.

The biggest blocker I see to having AI help us be more productive is that it transforms how day-to-day operations work.

Right now there is some balance in the pipeline of receiving change requests/enhancements, documenting them, estimating implementation time, analyzing costs and benefits, breaking the feature out into discrete stories, having the teams review the stories and 'vote' on a point sizing, planning when each feature should be completed given the team's current capacity and committing to the releases (PI planning), and then actually implementing the changes being requested.

However if I can take a code base and enter in a high level feature request from the stakeholders and then hold hands with Kiro to produce a functioning implementation in a day, then the majority of those steps above are just wasting time. Spending a few hundred man-hours to prepare for work that takes a few hundred man-hours might be reasonable, but doing that same prep work for a task that takes 8 man-hours isn't.

And we can't shift to that faster workflow without significant changes to the entire software pipeline. The entire PMO team dedicated to reporting on when things will be done shifts if that 'thing' is done before the report to the PMO lead is even finished. Or we need significantly more resources dedicated to planning enhancements so that we have an actual backlog of work for the developers. But my company appears interested in neither shrinking the PMO team nor expanding the intake staff.

mossTechnician 2 days ago | parent | prev | next

It could be really beneficial for Anthropic to showcase how they use their own product; since they're developers already, they're probably dogfooding it, and the effort required should be minimal.

- A lot of skeptics have complained that AI companies aren't specific about how they use their products, and this would be a great example of specificity.

- It could serve as a tutorial for people who are unfamiliar with coding agents.

- The video might not convince people who have already made up their minds, but at least you could point to it as a primary source of information.

weitendorf 2 days ago | parent

These exist. Just now I tried finding such a video for a medium-sized contemporary AI devtools product (Mastra), and it took me only a few seconds to arrive at https://www.youtube.com/watch?v=fWmSWSg848Q

There could be a million of these videos and it wouldn't matter; the problem is incuriosity/resistance/change-aversion. It's why so many people write comments complaining that these videos don't exist without spending even a single minute looking for them: they wouldn't watch these videos even if they existed. In fact, they assume/assert the videos don't exist without even looking, because they don't want them to exist: it's their excuse for not doing something they don't want to do.

flumpcakes a day ago | parent

That video was completely useless for me. I didn't see a single thing I would consider programming. I don't want to waste time building workflows or agents; I want to see them used to solve real-world, difficult problems from start to finish.

D-Machine a day ago | parent

I have to agree; this video is hardly what most people would mean by programming. Surely there are better videos than this?

flumpcakes a day ago | parent | prev | next

> How many 1+ hour videos of someone building with AI tools have you sought out and watched?

A lot, and they've almost all been advertising tripe and completely useless.

I don't want a demonstration of what a jet-powered hammer is from the salesperson, or how to oil it, or mindless fluff about how much time it will save me hammering things. I want to see a journeyman use a jet-powered hammer to build a log cabin.

I am personally not seeing this magic utopia. No one wants to show it to me; they just want to talk about how much better it is.

andai 2 days ago | parent | prev

The honest answer is that I would probably ask AI to analyze the video for me, and that it would probably do a pretty good job.

pksebben 2 days ago | parent | prev | next

I can only speak for myself, but it feels like playing with fire to productize this stuff too quickly.

Like, I woke up one day and a magical owl told me that I was a wizard. Now I control the elements with a flick of my wrist - which I love. I can weave the ether into databases, apps, scripts, tools, all by chanting a simple magical invocation. I create and destroy with a subtle murmur.

Do I want to share that power? Naturally, it would be lonely to hoard it and despite the troubles at the Unseen University, I think that schools of wizards sharing incantations can be a powerful good. But do I want to share it with everybody? That feels dangerous.

It's like the early internet: having a technical shelf to climb before you can use the thing creates a kind of natural filter for at least the kinds of people who care enough to think about what they're doing and why. Selecting for curiosity, at the very least.

That said, I'm also interested in more data from an engineering perspective. It's not a simple thing and my mind is very much straddling the crevasse here.

FuriouslyAdrift 2 days ago | parent | prev | next

LLMs are lossy compression of a corpus with a really good parser as a front end. As human made content dries up (due to LLM use), the AI products will plateau.

I see inference as the much bigger technology although much better RAG loops for local customization could be a very lucrative product for a few years.
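To make "RAG loop for local customization" concrete, here's a toy sketch: retrieve the local documents most similar to a query, then splice them into the prompt. This is an illustration under stated assumptions only; the bag-of-words "embedding" and the absence of a real model call are my simplifications, since an actual setup would use a learned embedding model, a vector store, and an LLM API.

```python
# Minimal sketch of one retrieval-augmented generation (RAG) step.
# The "embedding" is a toy bag-of-words vector; a real system would
# use a learned embedding model and send the prompt to an LLM.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank local documents by similarity to the query; keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Splice the retrieved context into the prompt sent to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "The deploy script lives in tools/deploy.sh",
    "Payroll runs on the last Friday of each month",
    "Deploys to prod require two approvals",
]
print(build_prompt("how do I deploy to prod?", docs))
```

The point of the sketch is where the customization lives: only the retrieval step touches private data, so improving it (better embeddings, chunking, ranking) upgrades the product without retraining any model.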

ChaseRensberger 2 days ago | parent | prev

Well said.