dsiegel2275 3 days ago

So I have all kinds of problems with this post.

First, the assertion that the best model of "AI coding" is a compiler. Compilers deterministically map a formal language to another under a spec. LLM coding tools are search-based program synthesizers that retrieve, generate, and iteratively edit code under constraints (tests/types/linters/CI). That’s why they can fix issues end-to-end on real repos (e.g., SWE-bench Verified), something a compiler doesn’t do. Benchmarks now show top agents/models resolving large fractions of real GitHub issues, which is evidence of synthesis + tool use, not compilation.

Second, that the "programming language is English". Serious workflows aren’t "just English." They use repo context, unit tests, typed APIs, JSON/function-calling schemas, diffs, and editor tools. The "prompt" is often code + tests + spec, with English as glue. The author attacks the weakest interface, not how people actually ship with these tools.
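
To make the point concrete, the payload an agent actually receives looks less like a paragraph of prose and more like structured context plus a tool schema. A hypothetical sketch in Python, with every file name and field purely illustrative:

    # Hypothetical sketch of an agent request: structured context + tool schema,
    # with English as the glue. Every name and path here is made up.
    request = {
        "system": "You are a code-editing agent. Keep the test suite green.",
        "context": {
            "repo_files": ["src/parser.py", "tests/test_parser.py"],   # retrieved repo context
            "api_surface": "def parse(src: str) -> Ast: ...",          # typed API the edit must respect
            "failing_test": "tests/test_parser.py::test_empty_input",  # the concrete constraint
        },
        "tools": [{
            "name": "apply_patch",                                     # function-calling schema
            "parameters": {
                "type": "object",
                "properties": {"diff": {"type": "string"}},
                "required": ["diff"],
            },
        }],
        "instruction": "Make test_empty_input pass without weakening its assertions.",
    }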

Third, non-determinism isn't disqualifying. Plenty of effective engineering tools are stochastic (fuzzers, search/optimization, SAT/SMT with heuristics). Determinism comes from external specs: unit/integration tests, type systems, property-based tests, CI gates.
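
A property-based test is exactly that kind of external spec. A minimal sketch, assuming pytest plus hypothesis, where my_sort is a hypothetical function the model generated:

    # Minimal sketch: whatever code the synthesizer emits for my_sort,
    # these properties must hold before it merges. mypkg.sorting is hypothetical.
    from hypothesis import given, strategies as st
    from mypkg.sorting import my_sort

    @given(st.lists(st.integers()))
    def test_output_is_sorted(xs):
        out = my_sort(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))

    @given(st.lists(st.integers()))
    def test_output_is_a_permutation(xs):
        assert sorted(my_sort(xs)) == sorted(xs)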

False dichotomy: "LLMs are popular only because languages/libraries are bad." Languages are improving (e.g., Rust, TypeScript), yet LLMs still help because the real bottlenecks are API lookup, cross-repo reading, boilerplate, migrations, test writing, and refactors: the areas where retrieval and synthesis shine. These are complementary forces, not substitutes.

Finally, no constructive alternatives are offered. "Build better compilers/languages" is fine but modern teams already get value by pairing those with AI: spec-first prompts, test-gated edits, typed SDK scaffolds, auto-generated tests, CI-verified refactors, and repo-aware agents.

A much better way to think about AI coding and LLMs is that they aren’t compilers. They’re probabilistic code synthesizers guided by your constraints (types, tests, CI). Treat them like a junior pair-programmer wired into your repo, search, and toolchain, not like a magical English compiler.

georgehotz 3 days ago | parent | next [-]

Author here. I agree with this comment, but if I wrote more like this my blog post would get less traction.

"LLM coding tools are search-based program synthesizers," in my mind this is what compilers are. I think most compilers do far too little search and opt for heuristics instead, often because they don't have an integrated runtime environment, but it's the same idea.

"Plenty of effective engineering tools are stochastic," sure but while a SAT solver might use randomness and that might adjust your time to solve, it doesn't change the correctness of the result. And for something like a fuzzer, that's a test, which are always more of a best effort thing. I haven't seen a fuzzer deployed in prod.

"Determinism comes from external specs and tests," my dream is a language where I can specify what it does instead of how it does it. Like the concept of Halide's schedule but more generic. The computer can spend its time figuring out the how. And I think this is the kind of tools AI will deliver. Maybe it'll be with LLMs, maybe it'll be something else, but the key is that you need a fairly rigorous spec and that spec itself is the programming. The spec can even be constraint based instead of needing to specify all behavior.

I'm not at all against AI, and if you are using it at a level described in this post, like a tool, aware of its strengths and limitations, I think it can be a great addition to a workflow. I'm against the idea that it's a magical English compiler, which is what I see in public discourse.

noodletheworld 3 days ago | parent | next [-]

I think the key insight I walked away with from this whole thread was:

A compiler takes source and maps it to some output. Regardless of the compiler's internals, this is an atomic operation; you end up with the source (unmodified) and an artifact.

These “agent workflows” are distinctly different.

The process of mapping a prompt to an output is the same, but these agent workflows are destructive: they modify the source.

Free rein over the entire code base: they modify the tests, the spec, the implementation.

It seems like this is a concept people are still struggling with: if your specification is poorly defined, and is dynamically updated during the compilation process, the results are more than just non-deterministic.

Over time, the specification itself becomes non-deterministic.

That's why unsupervised agents go “off the rails”: not because the specification can't be executed, but because over time the spec drifts.

That doesn't happen with compilers.

johnnyyyy 2 days ago | parent | prev [-]

In your blog post: “Most people do not care to find the truth, they care about what pumps their bags”

in your HN comment: “I agree with this comment, but if I wrote more like this my blog post would get less traction.”

Seems like you don't care about the truth either.

georgehotz 2 days ago | parent | next [-]

This is bait. The comment and the blog post say mostly the same thing, the debate is around the subtle edges.

It's not a "compiler," it's a "probabilistic code synthesizer guided by your constraints."

The latter is technically more specific and correct than the former, but it's 7 words instead of 1. And the word "compiler" is understood to encompass the latter, even if most compilers aren't that. They are both "a tool in a workflow."

neta1337 13 hours ago | parent | prev | next [-]

He cares for the truth by making it accessible to more people.

scubbo 2 days ago | parent | prev [-]

You said it before I could. Amen.

inimino 2 days ago | parent | prev | next [-]

People knock "English as a programming language", but in my opinion this is the whole value of AI programming: by the time you've expressed your design and constraints well enough that an LLM can understand it, then anyone can understand it, and you end up with a codebase that's way more maintainable than what we're used to.

The problem of course is when people throw away the prompt and keep the code, like the code is somehow valuable. This would be like everyone checking in their binaries and throwing away their source code every time, while arguments rage on HN about whether compilers are useful. (Meanwhile, compiler vendors compete on their ability to disassemble and alter binaries in response to partial code snippets.)

The right way to do AI programming: English defines the program, and the generated code is exactly as valuable as compiler output, i.e. it's the actual artifact that does the thing. In one sense it's the whole point, but iterating on it or studying it in detail is a waste of time, except occasionally when debugging. It's going to take a while, but eventually this will be the only way anybody writes code. (Note: I may be biased, as I've built an AI programming tool.)

If you can explain what needs to be done to a junior programmer in less time than it takes to do it yourself, you can benefit from AI. But, it does require totally rethinking the programming workflow and tooling.

euclidinspace a day ago | parent [-]

I don't think a prompt can be a valuable object the way code used to be. Unless Mira Murati is successful at scaling her approach to deterministic inference, a prompt is fragile and transient. And even if she is successful, LLM updates make a prompt much less useful over longer time horizons.

I think that the only useful objects to keep right now are DSPy programs together with well-crafted examples, with examples being the most valuable because they are transferable across models and architectures.
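
To make that concrete, here is a rough sketch of what I mean by "a DSPy program plus examples"; the task and field names are invented for illustration:

    # Rough sketch: the durable artifacts are the signature (the spec) and the
    # examples, not any one model's output. Task and field names are illustrative.
    import dspy

    # dspy.configure(lm=...) would select a model; omitted in this sketch.

    class WriteUnitTest(dspy.Signature):
        """Write a unit test for the described behavior."""
        behavior = dspy.InputField()
        test_code = dspy.OutputField()

    write_test = dspy.ChainOfThought(WriteUnitTest)

    examples = [
        dspy.Example(
            behavior="parse('') raises ValueError",
            test_code="def test_empty():\n    with pytest.raises(ValueError):\n        parse('')",
        ).with_inputs("behavior"),
    ]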

I also noticed several people in the thread comparing coding assistants to junior programmers. I disagree. The only parallel is that they will do what you tell them to. Otherwise, a coding assistant can hold an entire codebase in context, reason across patterns, and generate boilerplate faster than any human. That capability has no human analogue. And unlike a junior, they have no agency, so the comparison breaks down on multiple fronts.

mccoyb 3 days ago | parent | prev | next [-]

Excellent response, completely agree.

intothemild 3 days ago | parent | prev [-]

It's not surprising that you're finding problems with the article. It's written by George Hotz aka Geohot.

CamperBob2 3 days ago | parent [-]

I've seen some engaging, thought-provoking, and original thoughts from Geohot in the past, but this post absolutely was not one of them.