keeda | a day ago
Huh, I'm surprised that he goes from "No AI" to "AI autocomplete" to "Vibecoding / Agents" (which I assume means no human review, per his original coinage of the term). This seems to preclude the chat-oriented / pair-programming model, which I find most effective, or even the plan-spec-codegen-review approach, which IME works extremely well for straightforward CRUD apps.

Also, they discuss the nanochat repo in the interview, which has become more famous for his tweet about NOT vibe-coding it: https://www.dwarkesh.com/i/176425744/llm-cognitive-deficits

Things are more nuanced than what people have assumed, which seems to be "LLMs cannot handle novel code". The best I can summarize it as: he was doing rather non-standard things that confused the LLMs, which have been trained on vast amounts of very standard code and hence kept defaulting to those assumptions. A rough analogy: he was trying to "code golf" the repo, whereas the LLMs kept trying to write "enterprise" code, because that is overwhelmingly what they have been trained on.

I think this is where the chat-oriented / pair-programming or spec-driven model shines. Over multiple conversations (or from the spec), they can understand the context of what you're trying to do and generate what you really want. It seems Karpathy has not tried this approach (given his comments about autocomplete being his "sweet spot").

For instance, I'm working on some straightforward computer vision stuff, but it's complicated by the fact that I'm dealing with small, low-resolution images, which do not seem well represented in the literature. Without that context, the suggestions any AI gives me are sub-optimal. But after I mentioned it a few times, ChatGPT now "remembers" this in its context, and any suggestion it gives me during chat is automatically tailored to my use case, which produces much better results.
Put another way (I'm not an AI expert, so I may be using the terms wrong): LLMs will default to sampling from the data distribution they've been trained on, but given sufficient context, they can adapt their output to what you actually want.
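To make the "sufficient context" point concrete, here is a hypothetical sketch (the prompt wording and the use of the chat-completions message format are my assumptions, not anything from the thread): pin the non-standard constraint in a standing system message, so every question is pre-conditioned on it rather than relying on the model's defaults. This only builds the request payload; no API call is made.

```python
# Hypothetical sketch: carry the "remembered" constraint from the comment
# (small, low-resolution images) as an explicit system message, following
# the common chat-completions payload shape. No network call is made here.

def build_messages(question: str) -> list[dict]:
    """Prepend the standing project constraint to a user question."""
    context = (
        "All computer-vision advice must assume small, low-resolution "
        "input images; avoid techniques that presume high-resolution inputs."
    )
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ]

# Every request now carries the constraint up front, so suggestions are
# tailored to the non-standard setup instead of the training-data default.
msgs = build_messages("How should I do feature detection here?")
```

This is the spec-driven idea in miniature: the constraint lives in the conversation's context, not in the model's weights.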