| |
| ▲ | aspenmartin 2 hours ago | parent | next [-] | | > "in 6 months" that AI will be doing all of SWE work I assume this is the quote you're referring to from Davos? "I have engineers within Anthropic who say I don’t write any code anymore. I just let the model write the code, I edit it. I do the things around it… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs do end to end." that was in Jan, he said "might" and he said 6-12 months. Yes! Let's hold him accountable for saying reasonable things! | | |
| ▲ | hansmayer 2 hours ago | parent [-] | | Reasonable things? He has said the same shit over and over for the last several years. Yes, let's hold him accountable - you don't make such "oopsies" accidentally, several times in a row. | | |
| ▲ | aspenmartin an hour ago | parent [-] | | Seems pretty reasonable to me. Timescales are hard for anyone to predict. He is forced to do these predictions to know how much compute to buy in advance. Surprisingly, he underbought compute and now has to scramble to secure it from xAI or wherever he can. So he was overly conservative... | | |
| ▲ | hansmayer an hour ago | parent [-] | | > Timescales are hard for anyone to predict Indeed. That's why serious people are very careful, even if they are not running a company supposedly worth 1T USD > He is forced to do these predictions to know how much compute to buy in advance Ah well, that explains it. For my companies next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact to company business based on that. | | |
| ▲ | aspenmartin an hour ago | parent [-] | | > That's why serious people are very careful, even if they are not running a company supposedly worth 1T USD 10x revenue growth per year, even more this year...his predictions about when agents will claim SWE e2e work are his speculations, relevant because people care about what he thinks as he is closer than anyone to the leading edge of the technology. It's also important for him to be as accurate as he can about this because he has to put his money where his mouth is. He has to sign the right amount of compute otherwise he screws himself. He got it wrong in the opposite direction that you're implying, so at this point it sounds like you are more interested in your axe to grind than the truth on the ground. You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"? That's not happening at major companies. They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point. > For my companies next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact to company business based on that. You, as a leader of a company, don't have to make predictions? Don't have to make bets about what the best thing for you to do next year? That must be incredibly nice. Amodei and everyone else need to plan compute and plan their products and roadmap. You want him to....not do that? | | |
| ▲ | hansmayer 20 minutes ago | parent [-] | | > 10x revenue growth per year To the stunning tune of 5B in the lifetime. > You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"? Yeah, that's actually Dario's main talking point. > They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point Right, really sound arguments - writing is "clearly on the wall" and there are "so many data points". I'd be keen to use those immediately, but I am kind of missing the key of the "many data points" - namely, what did you build with it and how much ARR is it generating? > You, as a leader of a company, don't have to make predictions I have to make predictions, but not confabulations, lies and idiocies. > Amodei and everyone else need to plan compute FOR WHAT? Again, what was built with their shitty product in various companies and how much ARR did it generate? Uber seems to get no value out of it. | | |
| ▲ | aspenmartin 6 minutes ago | parent [-] | | Anthropic has generated far more than 5B in revenue, I don’t know what sort of computer you have but it evidently does have the Internet, I would recommend using that unless the Internet CEOs are also in trouble for hyping that one up. > Right, really sound arguments - writing is "clearly on the wall" and there are "so many data points". Thank you for recognizing this. Don’t read Ed and think you understand anything about AI is all I’ll say. Read epoch capability index paper and look at the dashboard chart or the METR time horizon chart and methodology and then return with what I imagine from historical comments will be another ferocious and impressive act of mental gymnastics. > I have to make predictions, but not confabulations, lies and idiocies. Idk you’ve been misquoting and aggressively against addressing any facts you are presented with and yet bring no facts of your own (hint: if you know what you’re talking about typically you can calmly discuss with actual facts). That feels pretty similar to confabulations, I won’t say idiocy I’m sure you are not an idiot but you seem to have a lot in common with your caricatures of tech CEOs. > FOR WHAT? Their product. |
| |
| ▲ | supern0va 2 hours ago | parent | prev | next [-] | | I work in big tech and probably 90% of code over the last month has been written by AI. And I suspect it's probably higher within Anthropic, which is probably what he's basing his opinion on. So, he's closer to correct than not. That said, your recollection is also flawed. It was in mid-March, and here's the relevant quotes: >I think we’ll be there in three to six months—where AI is writing 90 percent of the code. And then in twelve months, we may be in a world where AI is writing essentially all of the code. [...] >But the programmer still needs to specify, you know, what are—what are the conditions of what you’re doing, what—you know, what is the overall app you’re trying to make, what’s the overall design decision? How do we collaborate with other code that’s been written? You know, how do we have some common sense on whether this is a secure design or an insecure design? [...] >So as long as there are these small pieces that a programmer, a human programmer, needs to do, the AI isn’t good at, I think human productivity will actually be enhanced. But on the other hand, I think that eventually all those little islands will get picked off by AI systems. With another 3-4 months left on the clock, his prediction seems remarkably on point for at least certain organizations and domains. I welcome you to also hold yourself accountable in the coming months if this trend continues. ;) | | |
| ▲ | hansmayer an hour ago | parent | next [-] | | > I welcome you to also hold yourself accountable in the coming months if this trend continues. ;) My company did not swallow hundreds of billions in shady investment deals and is not publicly traded. We work with real money, and the revenue on our books is the revenue that is actually booked, not fake revenue we plan in 2 years time to maybe happen. So no, I am not going to hold myself accountable. But people who work with other people's money should be absolutely held accountable when their wild imaginations don't come true, repeatedly, quarter after quarter, year after year! | | |
| ▲ | aspenmartin an hour ago | parent | next [-] | | I think he means hold yourself accountable when it turns out your predictions and pessimism don't age well. | | |
| ▲ | hansmayer an hour ago | parent [-] | | Mate, for 5 years I've been hearing that crap. I am not predicting anything; on the contrary, the AI boosting bunch is. When are your predictions coming true? | | |
| ▲ | supern0va an hour ago | parent | next [-] | | AFAIK, most predictions from several years ago were for...approximately now to within the next few years. Can you be more specific? You criticized a very specific (and fake/misquoted) prediction, ignored the correction, and are now criticizing vague hand-wavey "predictions" that you have left unspecified. Can you please stop with the angry/ranty replies and actually have a real conversation grounded in actual facts? Now, having said all of the above...I'll also point out that these are predictions, not promises/guarantees. These people are being asked to forecast and are doing so. I hardly think they should be held responsible for not being literal oracles, but even so--please, at least quote them correctly/at all. In short: be better than the hallucinations you're seen to call out from the models. | |
| ▲ | aspenmartin an hour ago | parent | prev [-] | | What predictions, sorry? |
|
| |
| ▲ | supern0va an hour ago | parent | prev [-] | | I will note that you have essentially not responded to anything specific in my comment, nor at least acknowledged that you misstated Dario Amodei's actual prediction. |
| |
| ▲ | pier25 an hour ago | parent | prev | next [-] | | > And I suspect it's probably higher within Anthropic That probably explains why their uptime and reliability are so bad. | |
| ▲ | m1coti an hour ago | parent | prev [-] | | Written, but was it reviewed? Do you need to edit code written by an LLM? I agree that most things are written by AI, but writing code was never the bottleneck in big tech. | | |
| ▲ | supern0va an hour ago | parent | next [-] | | Yep! We have a review process where we have a few agents, each tuned to a particular domain of expertise (security, code quality, etc) which iterate until the feedback meets a certain threshold, at which point it goes over to humans for (hopefully) final review. That said, I generally agree that you're correct: writing code in many ways has not been the biggest bottleneck. However, by removing much of that writing, it frees up engineers to work on the uniquely human things that are larger bottlenecks. I had a few comments in a thread here touching on where I think most of the value has come from for us (which is largely search/understanding of our dependencies and making away team work far more viable, which aids with cutting through bureaucracy and the tendency for teams to push back on work): https://news.ycombinator.com/item?id=48298731 | |
| ▲ | hansmayer an hour ago | parent | prev [-] | | Haven't you heard - these days they just throw slop generated by LLM agents over to other LLM agents which cosplay as internal QA. They know it works because they write really strict .MD files where they instruct agents in English language to 'never do this' and 'always do that'. | | |
| ▲ | aspenmartin 41 minutes ago | parent [-] | | This is really what you think happens at large tech companies? You don't think it's possible this is maybe even slightly overly simplifying what the relevant processes are? | | |
| ▲ | hansmayer 30 minutes ago | parent [-] | | Read the other comment in the thread. Your buddy literally confirmed exactly what I wrote. | | |
| ▲ | aspenmartin 19 minutes ago | parent [-] | | Comment does indicate you don’t really seek to know how things work with respect to this and seem to not be able to imagine that the Occam’s razor is: agents are more useful than you think they are. |
| |
| ▲ | sampli 2 hours ago | parent | prev [-] | | Elon playbook |
|