lateforwork 8 hours ago

These days there's an even easier way to learn to write a compiler. Just ask Claude to write a simple compiler. Here's a simple C compiler (under 1500 lines) written by Claude: https://github.com/Rajeev-K/c-compiler It can compile and run C programs for sorting and searching. The code is very readable and very easy to understand.

voidfunc 7 hours ago | parent | next [-]

For those of us who learn better by taking something and tinkering with it, this is definitely the better approach.

I've never been a good book learner, but I love taking something apart and tinkering with it to learn. A small toy compiler is way better than any book, and it's not like the LLM didn't absorb the book during training anyway.

lateforwork 7 hours ago | parent [-]

Exactly! Writing a compiler is not rocket science if you know assembly language. You can pick up the gist in an hour or two by looking at a simple toy compiler.

LLMCodeAuditor 5 hours ago | parent | prev | next [-]

I did not and will not run this on my computer but it looks like while loops are totally broken; note how poor the test coverage is. This is just my quick skimming of the code. Maybe it works perfectly and I am dumber than a computer.

Regardless, it is incredibly reckless to ask Claude to generate assembly if you don't understand assembly, and it's irresponsible to recommend this as advice for newbies. They will not be able to scan the source code for red flags like us pros. Nor will they think "this C compiler is totally untrustworthy, I should test it on a VM."

lateforwork 4 hours ago | parent | next [-]

Are you concerned that the compiler might generate code that takes over your computer? If so, the provided Dockerfile runs the generated code in a container.

Regarding test coverage, this is a toy compiler. Don't use it to compile production code! Regarding while loops and such, again, this is a simple compiler intended only to compile sort and search functions written in C.

LLMCodeAuditor 4 hours ago | parent [-]

No, the problem is much more basic than "taking over your computer": it looks like the compiler generates incorrect assembly. Upon visual inspection I found a huge class of infinite loops, but I am sure there are subtle bugs that can corrupt running user/OS processes... including Docker, potentially. Containerization does not protect you from sloppy native code.

> Don't use it to compile production code!

This is an understatement. A more useful warning would be "don't use it to compile any code with a while loop." Seriously, this compiler looks terrible. Worse than useless.

If you really want AI to make a toy compiler just to help you learn, use Python or Javascript as a compilation target, so that the LLM's dumb bugs are mostly contained, and much easier to understand. Learn assembly programming separately.
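
A minimal sketch of what "Python as a compilation target" could look like, assuming a hypothetical tuple-based AST. None of these names (`gen_expr`, `emit`, `compile_program`, `run_program`) come from the linked repo; they are made up for illustration. A botched `while` translation here would show up as readable Python source instead of as a runaway native process:

```python
# Hypothetical mini-AST (not from the linked repository):
#   expressions: int literal | variable name (str) | (op, lhs, rhs)
#   statements:  ("assign", name, expr) | ("while", cond_expr, [statements])

def gen_expr(expr):
    """Turn an expression node into Python source text."""
    if isinstance(expr, int):
        return str(expr)
    if isinstance(expr, str):
        return expr  # variable reference
    op, lhs, rhs = expr
    return f"({gen_expr(lhs)} {op} {gen_expr(rhs)})"

def emit(stmt, indent=0):
    """Turn one statement node into a list of Python source lines."""
    pad = "    " * indent
    kind = stmt[0]
    if kind == "assign":
        return [f"{pad}{stmt[1]} = {gen_expr(stmt[2])}"]
    if kind == "while":
        lines = [f"{pad}while {gen_expr(stmt[1])}:"]
        body = [line for inner in stmt[2] for line in emit(inner, indent + 1)]
        # An empty loop body still needs a statement to be valid Python.
        return lines + (body or [f"{pad}    pass"])
    raise ValueError(f"unknown statement kind: {kind!r}")

def compile_program(stmts):
    """Compile a list of statements to one Python source string."""
    return "\n".join(line for stmt in stmts for line in emit(stmt))

def run_program(stmts):
    """Execute the generated source and return the resulting variables."""
    env = {}
    exec(compile_program(stmts), {}, env)
    return env

# Sum the integers 1..10 with a while loop.
program = [
    ("assign", "i", 1),
    ("assign", "total", 0),
    ("while", ("<=", "i", 10), [
        ("assign", "total", ("+", "total", "i")),
        ("assign", "i", ("+", "i", 1)),
    ]),
]
```

Calling `print(compile_program(program))` shows the generated loop, and `run_program(program)["total"]` evaluates to 55, so the loop translation can be checked by reading the output and by running it.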

lateforwork 3 hours ago | parent [-]

You have not provided any evidence that can be refuted, only vague assertions.

The compiler is indeed useless for any purpose other than learning how compilers work. It has all the key pieces such as a lexer, abstract syntax tree, parser, code generator, and it is easy to understand.

If the general approach taken by the compiler is wrong then I would agree it is useless even for learning. But you are not making that claim, only claiming to have found some bugs.

LLMCodeAuditor 3 hours ago | parent [-]

The thing that is obviously and indisputably wrong, terrible for learners, is the test cases. They are woefully insufficient, and will not find those infinite loops I discovered upon reading the code. The poor test coverage means you should assume I am correct about the LLM being wrong! It is rude and insulting to demand I provide evidence that some lazy vibe-coded junk is in fact bad software. You should be demanding evidence that the project's README is accurate. The repo provides none.

The code quality is of course unacceptably terrible but there is no point in reviewing 1500 lines of LLM output. A starting point: learners will get nothing out of this without better comments. I understand what's going on since this is all Compilers 101. But considering it's a) stingily commented and b) incorrect, this project is 100% useless for learners. It's indefensible AI slop.

lateforwork 3 hours ago | parent [-]

Sorry, I disagree. I have written compilers by hand, and this compiler generated by Claude is pretty good for learning.

I am only asking you to back up your own assertions. If you can't, then I would have to assume that you are denigrating AI because you are threatened by it.

LLMCodeAuditor 2 hours ago | parent | prev [-]

This comment is not worth responding to: https://news.ycombinator.com/newsguidelines.html#comments

I pointed to two very specific problems with this project, and also directly mentioned that assembly generation for while loops was broken. It's not a technical report but it's a lot of substance for an HN comment. Instead of addressing any of that substance, you resorted to an insulting ad hominem. You are thoughtlessly rejecting contrary evidence, then pretending I'm not providing any. It's straight out of the Flat Earther playbook.

lateforwork 2 hours ago | parent [-]

You claimed bugs, and when asked for evidence of said bugs, you said it is rude to ask for evidence, and I should simply "assume" you are right. Okay. I think people can make up their own minds as to what that means.

angusturner 7 hours ago | parent | prev [-]

Why read that instead of an actually well-written compiler, though?

lateforwork 7 hours ago | parent [-]

Because an actual compiler would be tens of thousands of lines and most of it is going to be perf optimization. If you want to get the big picture first, read a simple working compiler that has all the key parts, such as a lexer, abstract syntax tree, parser, code generator and so on.
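
To give a feel for how those parts fit together, here is a minimal sketch for arithmetic expressions only. It is not the linked compiler: the names (`lex`, `Parser`, `Num`, `BinOp`, `codegen`, `run`, `compile_expr`) and the tiny stack-machine instruction set are assumptions made up for illustration, and it targets a toy VM rather than assembly.

```python
import operator
import re
from dataclasses import dataclass

# --- Lexer: source text -> list of token strings ---
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def lex(src):
    src, tokens, pos = src.rstrip(), [], 0
    while pos < len(src):
        match = TOKEN_RE.match(src, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

# --- AST node types ---
@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str
    left: object
    right: object

# --- Parser: recursive descent, tokens -> AST ---
class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            node = BinOp(self.advance(), node, self.term())
        return node

    def term(self):  # term := atom (('*' | '/') atom)*
        node = self.atom()
        while self.peek() in ("*", "/"):
            node = BinOp(self.advance(), node, self.atom())
        return node

    def atom(self):  # atom := NUMBER | '(' expr ')'
        tok = self.advance()
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return Num(int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

# --- Code generator: AST -> stack-machine instructions (post-order walk) ---
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def codegen(node):
    if isinstance(node, Num):
        return [("PUSH", node.value)]
    return codegen(node.left) + codegen(node.right) + [(OPCODES[node.op],)]

# --- Toy VM so the generated code can be checked ---
# DIV uses Python floor division, which differs from C for negative results.
APPLY = {"ADD": operator.add, "SUB": operator.sub,
         "MUL": operator.mul, "DIV": operator.floordiv}

def run(code):
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(APPLY[instr[0]](lhs, rhs))
    return stack.pop()

def compile_expr(src):
    parser = Parser(lex(src))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return codegen(tree)
```

For example, `compile_expr("1 + 2 * 3")` yields `PUSH 1, PUSH 2, PUSH 3, MUL, ADD`, and `run` on that code returns 7. A real C compiler adds statements, types, and register or stack-frame handling on top of this same pipeline.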