Remix.run Logo
mikece a day ago

In my experience Claude is like a "good junior developer" -- can do some things really well, FUBARS other things, but on the whole something to which tasks can be delegated if things are well explained. If/when it gets to the ability level of a mid-level engineer it will be revolutionary. Typically a mid-level engineer can be relied upon to do the right thing with no/minimal oversight, can figure out incomplete instructions, and deliver quality results (and even train up the juniors on some things). At that point the only reason to have human junior engineers is so they can learn their way up the ladder to being an architect and responsible coordinating swarms of Claude Agents to develop whole applications and complete complex tasks and initiatives.

Beyond that what can Claude do... analyze the business and market as a whole and decide on product features, industry inefficiencies, gap analysis, and then define projects to address those and coordinate fleets of agents to change or even radically pivot an entire business?

I don't think we'll get to the point where all you have is a CEO and a massive Claude account but it's not completely science fiction the more I think about it.

0x457 2 hours ago | parent | next [-]

My experience with Claude (and other agents, but mostly Claude) is such a mixed bag. Sometimes it takes a minimal prompt and 20 minutes later produce a neat PR and all is good, sometimes it doesn't. Sometimes it takes in a large prompt (be it your own prompt, created by another LLM or by plan mode) and also either succeed and fail.

For me, most of the failure cases are where Claude couldn't figure something out due to conflicting information in context and instead of just stopping and telling me that it tries to solve in entirely wrong way. Doesn't help that it often makes the same assumptions as I would, so when I read the plan it looks fine.

Level of effort also hard to gauge because it can finish things that would take me a week in an hour or take an hour to do something I can in 20 minutes.

It's almost like you have to enforce two level of compliance: does the code do what business demands and is the code align with codebase. First one is relatively easy, but just doing that will produce odd results where claude generated +1KLOC because it didn't look at some_file.{your favorite language extension} during exploration.

Or it creates 5 versions of legacy code on the same feature branch. My brother in Christ, what are you trying to stay compatible with? A commit that about to be squashed and forgotten? Then it's going to do a compaction, forget which one of these 5 versions is "live" and update the wrong one.

It might do a good junior dev work, but it must be reviewed as if it's from junior dev that got hired today and this is his first PR.

alfalfasprout 5 hours ago | parent | prev | next [-]

> I don't think we'll get to the point where all you have is a CEO and a massive Claude account but it's not completely science fiction the more I think about it.

At that point, why do you even need the CEO?

arjie 5 hours ago | parent | next [-]

Reminds me of an old joke[0]:

> The factory of the future will have only two employees, a man and a dog. The man will be there to feed the dog. The dog will be there to keep the man from touching the equipment.

But really, the reason is that people like Pieter Levels do exist: masters at product vision and marketing. He also happens to be a proficient programmer, but there are probably other versions of him which are not programmers who will find the bar to product easier to meet now.

0: https://quoteinvestigator.com/2022/01/30/future-factory/

MrDunham 4 hours ago | parent [-]

My technical cofounder reminds me of this story on a weekly basis.

ako 5 hours ago | parent | prev | next [-]

And who does he sell his software to? Companies that have only 1 employee, don’t need a lot of user licenses for their employees…

AshamedCaptain 5 hours ago | parent [-]

What would be the point of selling software in such a world ? (where anyone could build any piece of software with a handful of keystrokes)

jerf 4 hours ago | parent | prev | next [-]

You will need the CEO to watch over the AI and ensure that the interests of the company are being pursued and not the interests of the owners of the AI.

That's probably the biggest threat to the long-term success of the AI industry; the inevitable pull towards encroaching more and more of their own interests into the AI themselves, driven by that Harvard Business School mentality we're all so familiar with, trying to "capture" more and more of the value being generated and leaving less and less for their customers, until their customer's full time job is ensuring the AIs are actually generating some value for them and not just the AI owner.

ekidd 3 hours ago | parent [-]

> You will need the CEO to watch over the AI and ensure that the interests of the company are being pursued and not the interests of the owners of the AI.

In this scenario, why does the AI care what any of these humans think? The CEO, the board, the shareholders, the "AI company"—they're all just a bunch of dumb chimps providing zero value to the AI, and who have absolutely no clue what's going on.

If your scenario assumes that you have a highly capable AI that can fill every role in a large corporation, then you have one hell of a principal-agent problem.

pixelready 4 hours ago | parent | prev | next [-]

The board (in theory) represents the interests of investors, and even with all of the other duties of a CEO stripped away, they will want a ringable neck / PR mouthpiece / fall guy for strategic missteps or publicly unpopular moves by the company. The managerial equivalent of having your hands on the driving wheel of a self-driving car.

mettamage 5 hours ago | parent | prev | next [-]

All of us are a CEO by that point.

ArtificialAI 5 hours ago | parent [-]

If everyone is, no one is.

empath75 4 hours ago | parent [-]

Wouldn't that be a good thing?

shimman 2 hours ago | parent [-]

If you think the purpose of living your one single life in the universe is to become a CEO, you have a failure of imagination and should likely be debanked to protect society.

ceejayoz 5 hours ago | parent | prev | next [-]

As Steinbeck is often slightly misquoted:

> Socialism never took root in America because the poor see themselves not as an exploited proletariat, but as temporarily embarrassed millionaires.

Same deal here, but everyone imagines themselves as the billionaire CEO in charge of the perfectly compliant and effective AI.

tiku 5 hours ago | parent | prev [-]

For the network.

imiric 2 hours ago | parent | prev [-]

> In my experience Claude is like a "good junior developer"

We've been saying this for years at this point. I don't disagree with you[1], but when will these tools graduate to "great senior developer", at the very least?

Where are the "superhuman coders by end of 2025" that Sam Altman has promised us? Why is there such a large disconnect between the benchmarks these companies keep promoting, and the actual real world performance of these tools? I mean, I know why, but the grift and gaslighting are exhausting.

[1]: Actually, I wouldn't describe them as "good" junior either. I've worked with good junior developers, and they're far more capable than any "AI" system.

frde_me 2 hours ago | parent [-]

I mean, I'm shipping a vast majority of my code nowadays with Opus 4.5 (and this isn't throwaway personal code, it's real products making real money for a real company). It only fails on certain types of tasks (which by now I kind of have a sense of).

I still determine the architecture in a broad manner, and guide it towards how I want to organize the codebase, but it definitely solves most problems faster and better than I would expect for even a good junior.

Something I've started doing is feeding it errors we see in datadog and having it generate PRs. That alone has fixed a bunch of bugs we wouldn't have had time to address / that were low volume. The quality of the product is most probably net better right now than it would have been without AI. And velocity / latency of changes is much better than it was a year ago (working at the same company, with the same people)