angarg12 2 hours ago

> This exact thing is what software developers have been begging for since the beginning of the profession: Receiving a detailed outline of the problem and what the end result should look like.

> This is often the part that slows down software development. Trying to figure out what a vague, title only, feature request actually means.

But that is exactly what software engineering is! It's 2026 and the notion that you can get requirements and specifications detailed enough to one-shot a perfect solution needs to die.

In my experience AI has made us able to iterate on features or ideas much faster. Now most of the friction comes from alignment and coordination with other teams. My take is that to accelerate processes we should reduce coordination overhead and empower individuals and teams to make decisions and execute on them.

Philip-J-Fry 28 minutes ago | parent | next [-]

I don't agree.

I regularly get pieces of work that some product guy has thought up in an afternoon. They only care about the happy path, and sometimes only part of the happy path. I work for a global company that has to abide by rules and regulations in each country we operate in. The product guy thinks up some feature, we implement the feature, then we're told "actually, we legally aren't allowed to do this in 90% of the markets we operate in". Cool, so we add an ability to disable it in those markets. Then they come back: "We can do this in some of those markets if it's implemented with [regulatory bureaucracy], so can you do that please".

Then we have to hack away at the solution because the deadline is right around the corner.

This is not software engineering! None of this is related to the software. The job of a software engineer is to take a list of requirements and figure out the way we accomplish those requirements. Requirements gathering is NOT a software engineering problem. Software is implementation, product is behaviour. That's the split. The behaviour of the thing we're building needs to be known before we even try to seriously build it.

If someone had just held back for a week and done their due diligence, we would have been able to architect a solution that is scalable, extensible, easy to maintain, and makes the future easier.

pron 2 hours ago | parent | prev | next [-]

> It's 2026 and the notion that you can get detailed enough requirements and specifications that you can one-shot a perfect solution needs to die.

It's 2026 and the idea that even with detailed-enough requirements you can one-shot even a workable (let alone perfect) solution also needs to die. Anthropic failed to build even something as simple as a workable C compiler, not only with a perfect spec (and reference implementations, both of which the model trained on) but even with thousands of tests painstakingly written over many person-years. Today's models are not yet capable enough to build non-trivial production software without close and careful human supervision, even with perfect specs and perfect tests. Without a perfect spec and a perfect human-written test suite the task is even harder. Maybe in 2027.

ianbutler 2 hours ago | parent | next [-]

Sorry, where are we seeing that it failed? It compiled multiple projects successfully, albeit with less optimized output.

" It lacks the 16-bit x86 compiler that is necessary to boot Linux out of real mode. For this, it calls out to GCC (the x86_32 and x86_64 compilers are its own).

It does not have its own assembler and linker; these are the very last bits that Claude started automating and are still somewhat buggy. The demo video was produced with a GCC assembler and linker.

The compiler successfully builds many projects, but not all. It's not yet a drop-in replacement for a real compiler. The generated code is not very efficient. Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.

The Rust code quality is reasonable, but is nowhere near the quality of what an expert Rust programmer might produce. "

For faffing about with a multi-agent system, that seems like a pretty successful experiment to me.

Source: https://www.anthropic.com/engineering/building-c-compiler

Edit: I think people don't realize that not even 7 months ago it couldn't write this at all.

pron an hour ago | parent | next [-]

> where are we seeing that it failed?

Anthropic said the experiment failed to produce a workable C compiler:

- I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality.

- The compiler successfully builds many projects, but not all. It's not yet a drop-in replacement for a real compiler.

(source: https://www.anthropic.com/engineering/building-c-compiler)

Software that cannot be evolved is dead software. That in some PR communications they misrepresented their own engineer's report is beside the point.

> It compiled multiple projects successfully albeit less optimized.

150,000x slower (https://github.com/harshavmb/compare-claude-compiler) is not "less optimised". It's unworkable.

> Like I think people don't realize not even 7 months ago it wasn't writing this at all.

There's no doubt that producing a C compiler that isn't workable and is effectively bricked (it cannot be evolved) but still compiles some programs is great progress, but it's still a long way off autonomously building production software. Can today's LLMs do amazing things and offer tremendous help in software development? Absolutely. Can they write production software without careful and close human supervision? Not yet. That's not disparagement, just an observation of where we are today.

ianbutler 25 minutes ago | parent [-]

> Can they write production software without careful and close human supervision? Not yet. That's not disparagement, just an observation of where we are today.

I never claimed they could! I just view this as a successful experiment. I don't think Anthropic was making that claim with their experiment either.

It feels reflexive to the moment to argue against that claim, but I tend to operate with a bit more nuance than "all good" or "all bad".

dnautics 2 hours ago | parent | prev [-]

Yeah I think people are really underestimating what LLMs can do even without specs.

As an example, I made an exploratory attempt to build custom software on top of some genuinely awful Windows software for a scientific imaging station with a proprietary industrial camera. Five days later, Claude and I had figured out via USB pcap how to sample images, and it has been operationalized and running smoothly for months now. 100% of the code was written by Claude, and it's all clean (I reviewed it myself); pretty much all I did was unstick it in a few places ("hey, based on the file sizes it looks like the images are being sent in a 16-bit format").

For day to day work, I'll often identify a bug, "hey, when I shift click on this graphical component, it's not doing the right thing". I go tell Claude to write a RED (failing) integration test, then make it pass.

Zero lines of code manually written. Only occasionally do I have to intervene and rearchitect. Usually this involves me writing about ten lines of scaffold code, explaining the architectural concept, and telling it to just go.
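That red-to-green loop can be sketched like this (purely illustrative: `select_range` and the test are invented stand-ins for the shift-click handler, not code from any real project):

```python
# Hypothetical stand-in for the shift-click selection handler.
def select_range(items, anchor, clicked):
    """Return the contiguous slice between anchor and clicked, inclusive."""
    lo, hi = min(anchor, clicked), max(anchor, clicked)
    return items[lo:hi + 1]

# Step 1: capture the bug report as a failing (RED) test first.
def test_shift_click_selects_inclusive_range():
    items = ["a", "b", "c", "d", "e"]
    # Shift-clicking item 3 with the anchor on item 1 should select b, c, d.
    assert select_range(items, anchor=1, clicked=3) == ["b", "c", "d"]
    # Selecting "backwards" (clicked before anchor) should also work.
    assert select_range(items, anchor=4, clicked=2) == ["c", "d", "e"]

# Step 2: run the test, watch it fail against the buggy handler,
# then let the agent modify select_range until it passes.
test_shift_click_selects_inclusive_range()
```

The point of writing the RED test first is that the agent gets an unambiguous, machine-checkable definition of "done" before it touches the implementation.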

pron an hour ago | parent [-]

People are both underestimating and overestimating what LLMs can do. LLMs have shown very different results when autonomously writing a small program for personal use and autonomously writing large production software that needs to be evolved for years.

SirHumphrey 2 hours ago | parent | prev [-]

Most software is much simpler than a C compiler.

pron 2 hours ago | parent | next [-]

A workable C compiler is a ~10-50KLOC program, and a fairly simple one at that (batch, with no concurrency or interaction). That Anthropic's swarm of agents wrote 100KLOC before failing is a symptom of the problem. It's certainly possible that many programs are in the sub 5KLOC range, but it's definitely not "most software". Plus, almost no software has this level of detailed spec, ready-made tests, and a selection of existing implementations of the same spec.

My first thought when reading Anthropic's description of the experiment was that it is unrealistically easy. It's hard to come up with realistic jobs in the 10-50KLOC range that would be this easy for an LLM. That it failed only shows how much further we still have to go.

quantumleaper 2 hours ago | parent | next [-]

A bit off topic, but note how Anthropic's publicity stunts went from the "Claude C Compiler" with 100K LOC to the recent Bun Rust rewrite with 1M LOC (10x!) in just 3 months.

I get that it's "novel" creation vs porting, but given that they reported that the C compiler cost them $20k in API costs, the Bun rewrite must be at least $200k, maybe even closer to a million. Pure madness.

gmueckl 2 hours ago | parent | next [-]

Asking an LLM to change the programming language of an implementation is completely different from asking it to code from a spec. It's orders of magnitude simpler in practice. I converted some 60 KLOC of Java to C++ and it works. There were some issues where the Java implementation used runtime reflection, because that needs creative workarounds, and not all of the C++ translations worked on the first try. And that was my first serious attempt at such a task with an LLM; I could likely do better now. An important simplification here is that a well-designed codebase can be converted in small pieces and then joined back together, so the total amount of code converted becomes an irrelevant metric.
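The reflection issue deserves a concrete illustration: runtime reflection looks up behaviour by name at runtime, which has no direct equivalent in a language without it, so a translation needs an explicit dispatch structure instead. A minimal sketch of the two patterns (in Python for brevity; all names here are invented, not from the codebase above):

```python
# Reflection-style dispatch (the Java pattern, sketched with Python's
# getattr): the handler method is looked up by name at runtime.
class Handlers:
    def on_save(self, doc):
        return f"saved {doc}"

    def on_close(self, doc):
        return f"closed {doc}"

def dispatch_reflective(obj, event, doc):
    # Nothing here is statically known; the name is built at runtime.
    return getattr(obj, f"on_{event}")(doc)

# The kind of "creative workaround" a C++ translation typically needs:
# an explicit dispatch table whose entries are statically known.
DISPATCH_TABLE = {
    "save": Handlers.on_save,
    "close": Handlers.on_close,
}

def dispatch_table(obj, event, doc):
    return DISPATCH_TABLE[event](obj, doc)
```

The second form is what those workarounds tend to look like: the set of dispatchable names becomes explicit and fixed at compile time, which is exactly the property a statically compiled target language requires.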

pron 2 hours ago | parent | prev [-]

Yes, the task is very different, but it will also be months to a year until we know the results of the Bun experiment.

quantumleaper 2 hours ago | parent [-]

I don't know how it could fail. Bun loses popularity among devs? Is there an objective metric? From what I understand, Node.js remains dominant across the industry as a whole, with Deno and Bun mostly used by startups.

Anthropic can always fire the Opus/Mythos token machine gun at any problem (bugs, features, security) to ensure PR success, and there would be plenty of AI-sphere startups already drinking the Kool-Aid that would count the whole vibe-coding thing to Bun's benefit.

pron an hour ago | parent | next [-]

> Anthropic can always fire the Opus/Mythos token machine gun on any problem (bugs, features, security) to ensure PR success,

Can they, though? They tried and failed to do it in their C compiler experiment. The experimenter wrote: "I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality."

eudamoniac an hour ago | parent | prev [-]

It could fail due to maintenance burden. There is a lot of code now that no one wrote.

rowanG077 an hour ago | parent | prev [-]

The compiler that Claude made went way beyond workable. It could compile the full Linux kernel, afaik. That goes well beyond even standard C.

pron 37 minutes ago | parent [-]

People who independently tried to use it reported that it is very much not workable:

- "CCC compiled every single C source file in the Linux 6.9 kernel without a single compiler error (0 errors, 96 warnings). This is genuinely impressive for a compiler built entirely by an AI. However, the build failed at the linker stage with ~40,784 undefined reference errors." (https://github.com/harshavmb/compare-claude-compiler)

- Overall it’s an interesting experiment, and shows the current bleeding edge of Claude’s Opus 4.6 model. However the resulting product is also a clear example of the throwaway nature of projects generated almost entirely by AI code agents with little human oversight. The prototype is really impressive, but there is no real path forward for it to be further developed. It can build the Linux kernel [for RISC-V], which is impressive. It can also build other things… if you are lucky, but you really cannot rely on it to work. (https://voxelmanip.se/2026/02/06/trying-out-claudes-c-compil...)

Anthropic themselves said that the codebase was effectively bricked and that their agents could not salvage it.

binary0010 2 hours ago | parent | prev [-]

Not really.

I can make a C compiler in a couple of weeks just by looking up open-source implementations and copying them.

I can't make any software that people will pay me money to use without months/years of development, research, experimentation, and iteration.

Just because the original people who invented compilers had to be geniuses doesn't mean anyone has to spend much time or thought copying that work now.

juanre an hour ago | parent | prev | next [-]

I completely agree. It's been more than 40 years since I wrote my first program, and I've never seen software that was first specified, then written, and all was good.

The most difficult part of any non-trivial engineering is understanding the problem, and the first versions of a piece of software are how you reach that understanding.

That's why I do not think that AI-powered "software factories" will ever work. It's waterfall development all over again. An architect writing UML diagrams and handing them off to the team of programmers to do the essentially mundane task of implementing... the wrong thing.

AI is, however, very good at helping you go fast from the wrong first version to the less wrong second one. But you need to remember that your main task is to understand the problem that you are trying to solve.

Cthulhu_ 2 hours ago | parent | prev | next [-]

I'm seeing decision-makers / people who write requirements starting to use AI as well in my day-to-day work. As before, my job is to read, understand, and test those requirements against the real world as I understand it. The same goes for code. Software engineering for the past 20 years (at least) has had a core focus of "don't trust anyone"; this hasn't changed, and it still takes a lot of time and effort.

Terr_ 2 hours ago | parent [-]

The problem is that instead of trying to figure out what they really want/need, now we're trying to figure out what they really wanted or needed before it got obfuscated by the babble-machine.

harrall 2 hours ago | parent | prev | next [-]

Trying to figure out the best way to solve vague requirements is why I got into engineering.

If I got detailed specs, I’d just be a coding robot. I push that work off onto juniors.

mmcnl an hour ago | parent | prev | next [-]

This is true, but funny thing is: it was also true before AI.

stingraycharles 2 hours ago | parent | prev | next [-]

Yeah I agree, such a fundamental aspect of software engineering is translating ambiguous “asks” into specific requirements. We now have a tool to convert those requirements directly into code.

And yes, architecture and how to actually implement the designs are also part of the requirements.

The code is just the implementation, the actual problem that needs solving is one abstraction level higher.

gedy an hour ago | parent | prev | next [-]

> Trying to figure out what a vague, title only, feature request actually means.

> My take is that to accelerate processes we should reduce coordination overhead and empower individuals and teams to make decisions and execute on them.

This is funny because it's exactly what the agile/scrum training taught me 20 years ago.

ModernMech 2 hours ago | parent | prev [-]

It's UML and outsourcing all over again: if only we could write the perfect UML diagrams representing the ideal class hierarchy, we could just put them in an email, send it to India, and get back exactly the program we wanted, no mistakes!