chucklenorris 7 hours ago

1 is definitely false right now. I gave specs, tests, full datasets, and reference code to translate to an LLM, and it still produced garbage code and fell flat on its face. I just spent one week translating a codebase from Go to C++ and I had to throw the whole thing out because it put in some horrible bugs that it could not fix, even after burning $500 worth of tokens with me babysitting it. As I said, it had everything at its disposal: tests, a reference impl, lots of data to work with. I finally got my lazy ass to implement it myself and, lo and behold, I did it in 2 days with no bugs (that I know of), and the code quality is miles better than that undigested vomit. The codebase was a protocol library for decoding network traffic that used a lot of bit twiddling, flow control, Huffman table compression, mildly complicated stuff. So no - if you want working non-trivial code that you can rely on, then definitely don't use an LLM to do it. Use it for autocomplete and small bits of code, but never let the damn thing do the thinking for you.

pron 6 hours ago | parent [-]

Oh, I agree. Anthropic themselves proved that even with a full spec and thousands of human-crafted tests, unsupervised agents couldn't produce even something as relatively simple as a workable C compiler, even when the model was trained on the spec, tests, the theory, a reference implementation, and even when given a reference implementation as an oracle.

But my point was that I think the development of Claude Code itself is supervised, hence it's not really "vibe coded".