| ▲ | simonw a day ago |
It looks like Andrej's definition of "agent" here is an entity that can replace a human employee entirely - from the first few minutes of the conversation:

> When you’re talking about an agent, or what the labs have in mind and maybe what I have in mind as well, you should think of it almost like an employee or an intern that you would hire to work with you. For example, you work with some employees here. When would you prefer to have an agent like Claude or Codex do that work? Currently, of course they can’t. What would it take for them to be able to do that? Why don’t you do it today? The reason you don’t do it today is because they just don’t work. They don’t have enough intelligence, they’re not multimodal enough, they can’t do computer use and all this stuff. They don’t do a lot of the things you’ve alluded to earlier. They don’t have continual learning. You can’t just tell them something and they’ll remember it. They’re cognitively lacking and it’s just not working. It will take about a decade to work through all of those issues.
| ▲ | sarchertech a day ago | parent | next [-] |
He’s not just talking about agents good enough to replace workers. He’s talking about whether agents are currently useful at all.

> Overall, the models are not there. I feel like the industry is making too big of a jump and is trying to pretend like this is amazing, and it’s not. It’s slop. They’re not coming to terms with it, and maybe they’re trying to fundraise or something like that. I’m not sure what’s going on, but we’re at this intermediate stage. The models are amazing. They still need a lot of work. For now, autocomplete is my sweet spot. But sometimes, for some types of code, I will go to an LLM agent.

> They kept trying to mess up the style. They’re way too over-defensive. They make all these try-catch statements. They keep trying to make a production code base, and I have a bunch of assumptions in my code, and it’s okay. I don’t need all this extra stuff in there. So I feel like they’re bloating the code base, bloating the complexity, they keep misunderstanding, they’re using deprecated APIs a bunch of times. It’s a total mess. It’s just not net useful. I can go in, I can clean it up, but it’s not net useful.
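To make the second quote concrete, here is a hypothetical sketch of the over-defensive style Karpathy describes (invented code, not an example from the interview):

    # Hypothetical illustration of the "over-defensive" agent style:
    # try/except wrappers and re-validation around code whose invariants
    # the surrounding project already guarantees.
    def load_checkpoint_defensive(path):
        try:
            if path is None:
                raise ValueError("path must not be None")
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            print(f"Warning: {path} missing, falling back to None")
            return None
        except Exception as exc:  # swallows bugs that should crash loudly
            print(f"Unexpected error: {exc}")
            return None

    # The leaner style a personal or research codebase often prefers:
    # let it crash, because "the file exists" is already one of the
    # codebase's assumptions.
    def load_checkpoint(path):
        with open(path, "rb") as f:
            return f.read()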
| ▲ | sothatsit 21 hours ago | parent | next [-] |
I don't think he is saying agents are not useful at all, just that they are nowhere near the capability of human software developers.

Karpathy later says he used agents to write the Rust translation of algorithms he wrote in Python, and he explicitly says that agents can be useful for writing boilerplate or for code that is commonly found online. So he isn't saying they are useless; he is holding agents to the higher standard of working on a novel codebase, and saying they don't pass that bar.

Tbh I think people underestimate how much software development work is just writing boilerplate or common patterns. A very large percentage of the web development work I do is just writing CRUD boilerplate, and agents are great at it (the sketch below shows the kind of thing I mean). I also find them invaluable for searching through large codebases and for basic code review, but these use cases get discussed less even though they're a big part of what I find useful about agents.
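For instance, a minimal sketch of the kind of CRUD boilerplate meant above (hypothetical Flask code; the `/items` resource and its fields are invented):

    # Minimal, hypothetical CRUD boilerplate: an in-memory "items"
    # resource with create and read endpoints in Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    items = {}    # stand-in for a real database table
    next_id = 1

    @app.route("/items", methods=["POST"])
    def create_item():
        global next_id
        item = {"id": next_id, "name": request.json["name"]}
        items[next_id] = item
        next_id += 1
        return jsonify(item), 201

    @app.route("/items", methods=["GET"])
    def list_items():
        return jsonify(list(items.values()))

    @app.route("/items/<int:item_id>", methods=["GET"])
    def get_item(item_id):
        if item_id not in items:
            return jsonify({"error": "not found"}), 404
        return jsonify(items[item_id])

Agents churn this sort of thing out reliably precisely because thousands of near-identical examples exist online.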
| ▲ | sarchertech 18 hours ago | parent | next [-] |
I’m not saying he’s saying agents aren’t useful at all. It’s literally in the quotes I provided that he says they are useful for some subset of tasks.

I’m saying that he is answering the question “are agents useful at all?”, not “can agents replace humans?”. His answer is mostly not: he generally prefers autocomplete, but they are useful for some limited tasks.
| ▲ | weatherlite 15 hours ago | parent [-] |
> I’m not saying he’s saying agents aren’t useful at all

I'm not saying you're saying he's saying agents aren't useful at all
| ▲ | sarchertech 14 hours ago | parent [-] |
You’re not the person I’m replying to. The person I’m replying to said:

> I don't think he is saying agents are not useful at all, just that they are not anywhere near the capability of human software developers.

implying that I was supporting the first clause.
| ▲ | CaptainOfCoit 20 hours ago | parent | prev [-] |
My biggest takeaway is that agents/LLMs in general are super helpful when paired with a human who knows the ins and outs of software development and who uses them side by side with their normal work. They start being less useful when you treat them as "I can send them ill-specified stuff, ignore them for 10 minutes, and merge their results", as things spiral out of control.

Basically, "vibe-coding" as a concept doesn't work for projects you need to iterate on, only for things you're OK with eventually throwing away.

Augmenting the human intellect with LLMs? Usually an increase in productivity. Replacing human coworkers with LLMs? Good luck, have fun.
| ▲ | rhetocj23 10 hours ago | parent [-] |
It does seem pretty clear that an individual who possesses super-high-quality human capital, paired with something like an LLM (provided the LLM is good enough relative to the individual), can be a powerful combination.

The issues are:

1) There isn't enough supply of those individuals.

2) Such an LLM doesn't exist (at least not consistently).

3) The amount being invested will not yield returns commensurate with the required rate of return.

Interestingly enough, I believe Andrej Karpathy is also focusing on education (levelling up the supply of human capital). I came to the above conclusion about a month ago, and it 'feels' right to me.
| ▲ | consumer451 a day ago | parent | prev | next [-] |
I am just some shmoe, but I agree with that assessment. My biggest take-away is that we got super lucky. At least now we have a slight chance to prepare for the potential economic and social impacts.
| ▲ | Bengalilol a day ago | parent [-] |
I am thinking the same. And we should start considering what makes us human and how we can valorize our common ground.
| ▲ | tablatom 21 hours ago | parent [-] |
This. I believe it’s the most important question in the world right now.

I’ve been thinking long and hard about this from an entirely practical perspective, and I’ve surprised myself: the answer seems to be our capacity to love. The idea is easily dismissed as romantic, but when I say I’m being practical I really mean it. I’m writing about it here: https://giftcommunity.substack.com/
| ▲ | kubb a day ago | parent | prev [-] |
My ever-growing reporting chain is incredibly invested in having autonomous agents next year.
| ▲ | eddiewithzato a day ago | parent | prev | next [-] |
Because that's the definition that is driving all these investments: the promise that they will reach it very soon. If Altman said plainly that LLMs will never reach that stage, there would be a lot less investment in the industry.
| ▲ | aik a day ago | parent [-] |
Hard disagree. You don’t need AGI to transform countless workflows within companies; current LLMs can do it. A lot of the current investment is to meet the demand for current-generation LLMs (and the use cases we know will keep opening up with incremental improvements).

Are you aware of how intensely the main companies that host leading models (Azure, AWS, etc.) are throttling usage because there isn’t enough data center capacity? (E.g., at my company we have 100x more demand than we can get capacity for, and we’re barely getting started. We have a roadmap with 1000x+ the current demand, and we’re a relatively small company.)

AGI would be more impactful of course, and some use cases aren’t possible until we have it, but that doesn’t diminish the value of current AI.
| ▲ | kllrnohj a day ago | parent | next [-] |
> Eg. At my company we have 100x more demand than we can get capacity for, and we’re barely getting started. We have a roadmap with 1000x+ the current demand and we’re a relatively small company.

OpenAI's revenue is $13bn, with 70% of that coming from people just spending $20/mo to talk to ChatGPT. Anthropic is projecting $9bn in revenue in 2025. For a nice cold splash of reality, fucking Arizona Iced Tea has $3bn in revenue (and that's actual revenue, not ARR).

You might have 100x more demand than you can get capacity for, but if that 100x still puts you at a number that is small in absolute terms, it's not very impressive. Similarly, if you're already unprofitable and achieving 100x growth requires 1,000x the spend, that's not a recipe for success either. In fact, it's a recipe for going bankrupt in a hurry.
| ▲ | aik 16 hours ago | parent | next [-] |
I have no idea if OpenAI’s valuation is reasonable. All I’m saying is that I’m convinced the demand is there, even without AGI around the corner. You do not need AGI to transform countless industries. And we are profitable on our AI efforts while adding massive value to our clients.

I know less about OpenAI’s economics; I know there are questions about whether their model is sustainable and for how long. I am guessing they are thinking about it and have a plan?
| ▲ | hyperadvanced a day ago | parent | prev [-] |
This is correct, and it should burn the retinas of anyone thinking that OAI or Anthropic are in any way worth their multi-billion-dollar valuations.

I liked AK's analysis of AI for coding here (it's overly defensive, lacks style and functionality awareness, is a cargo cultist, and/or just does it wrong a lot). But autocomplete itself is super valuable, as is the ability to generate simple frontend code, letting you build a user interface without needing a team of people with those in-house skills.
| ▲ | vharish 19 hours ago | parent [-] |
There are many more use cases that aren't fully realised yet. With regard to coding, LLMs have shortcomings, but there's a lot of work that can be automated. Any work that requires interaction with a computer can eventually be automated to some extent; to what extent, only time will tell.
| ▲ | hyperadvanced 16 hours ago | parent [-] |
Sure, but you don't need AI to automate computer work. You can make a career out of formalizing the kinds of Excel-jockeying that people do for reports or data entry.
| ▲ | bloppe a day ago | parent | prev | next [-] |
This is a relatively reasonable take. Unfortunately, it's not what most AI investors or non-technical punters think: since GPT-1, it's been all about unlocking 100%+ annual GDP growth through wholesale white-collar automation. I agree with AK that the actual effect on GDP will be more or less negligible, which will be an unmitigated disaster for us economically, given how much cash has already been incinerated.
| ▲ | Culonavirus a day ago | parent | prev [-] |
Oh look, people with skin in the AI game insist AI is not a massive bubble. More news at 11.
| ▲ | aik 16 hours ago | parent [-] |
We’re a regular old SaaS company that has figured out how to add massive value using AI. I am making no statements about valuations and bubbles. I’m actually guessing there is some bubble / overhype. That doesn’t mean it isn’t still incredibly valuable.
| ▲ | rhetocj23 15 hours ago | parent [-] |
Link? Can you explain in detail, step by step, what you have done, so we can analyse it for ourselves?
| ▲ | bbor a day ago | parent | prev | next [-] |
Quite telling -- thanks for the insightful comment as always, Simon. Didn't know that, even though I've been discussing this on and off all day on Reddit. He's a smart man with well-reasoned arguments, but I think he's also a bit poisoned by working at such a huge org, with all the constraints that come with it. Like this:

> You can’t just tell them something and they’ll remember it.

It might take a decade to work through this issue if you just want to put a single LLM in a single computer and have it be a fully-fledged human, sure. And since he works at a company making some of the most advanced LLMs in the world, that perspective makes sense! But of course that's not how it's actually going to be (/already is).

LLMs are a necessary part of AGI (/"agents") due to their ability to avoid the Frame Problem [1], but they're far from the only thing needed. We're pretty dang good at "remembering things" with computers already, and connecting that with LLM ensembles isn't going to take anywhere close to 10 years. Arguably, we're already doing it pretty darn well in unified systems [2] (see the toy sketch below).

If anyone's unfamiliar and finds my comment interesting, I highly recommend Minsky's work on the Society of Mind, which handled this topic definitively over 20 years ago. Namely:

A short summary of "Connectionism and Society of Mind" for laypeople at DARPA: https://apps.dtic.mil/sti/tr/pdf/ADA200313.pdf

A description of the book itself, available via Amazon in 48h or via PDF: https://en.wikipedia.org/wiki/Society_of_Mind

By far my favorite paper on the topic of connectionist+symbolist syncretism, though a tad long: https://www.mit.edu/~dxh/marvin/web.media.mit.edu/~minsky/pa...

[1] https://plato.stanford.edu/entries/frame-problem/

[2] https://github.com/modelcontextprotocol/servers/tree/main/sr...
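As a toy illustration of that external-memory point: the pattern is just to persist facts outside the model and inject them into each prompt (hypothetical code; `call_llm` is a stand-in, not a real API):

    # Toy sketch of external memory for a stateless LLM: "remembering"
    # is ordinary storage plus prompt injection, not model retraining.
    import json
    from pathlib import Path

    MEMORY = Path("memory.json")

    def call_llm(prompt):
        # Stand-in for any real chat-model API call.
        return f"[model response to a {len(prompt)}-char prompt]"

    def remember(fact):
        facts = json.loads(MEMORY.read_text()) if MEMORY.exists() else []
        facts.append(fact)
        MEMORY.write_text(json.dumps(facts))

    def ask(question):
        facts = json.loads(MEMORY.read_text()) if MEMORY.exists() else []
        prompt = "Known facts:\n" + "\n".join(facts) + "\n\nQuestion: " + question
        return call_llm(prompt)

    remember("The deploy target is us-east-1.")
    print(ask("Where do we deploy?"))

Real systems swap the JSON file for a vector store and retrieve only the relevant facts, but the principle is the same.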
| ▲ | erichocean 17 hours ago | parent [-] |
> You can’t just tell them something and they’ll remember it.

I find it fascinating that this is the problem people consistently think we're a decade away from solving. If you can't do this, you don't have employee-like AI agents, you have AI-enhanced scripting. It's basically the first thing you have to be able to do to credibly replace an actual human employee.
| ▲ | ambicapter 14 hours ago | parent | prev [-] |
Do you have a comment? Most of what you've said here is a quote.
| ▲ | simonw 10 hours ago | parent [-] |
This is part of my wider hobby of collecting definitions of "agents" - you can see more in my collection here: https://simonwillison.net/tags/agent-definitions/

In this case the specific definition matters, because the title of the HN submission is "it will take a decade to work through the issues with agents."