| ▲ | chollida1 7 days ago |
| Most of this might be true for LLMs, but years of investing experience has created a mental model of looking for the tech or company that sucks and yet keeps growing. People complained endlessly about the internet in the early to mid 90s: it was slow, static, most sites had under-construction signs on them, your phone modem would just randomly disconnect. The internet did suck in a lot of ways, and yet people kept using it. Twitter sucked in the mid 2000s; we saw the fail whale weekly, and yet people continued to use it for breaking news. Electric cars sucked: no charging, low range, expensive, and yet no matter how much people complained about them, they kept getting better. Phones sucked: pre-3G was slow, there wasn't much you could use them for before app stores, and the cameras were potato quality, and yet people kept using them while they improved. Always look for the technology that sucks and yet people keep using it because it provides value. LLMs aren't great at a lot of tasks, and yet no matter how much people complain about them, they keep getting used and keep improving through constant iteration. LLMs may not be able to build software today, but they are 10x better than where they were in 2022 when we first started using ChatGPT. It's pretty reasonable to assume that in 5 years they will be able to do these types of development tasks. |
|
| ▲ | freehorse 7 days ago | parent | next [-] |
| At the same time, there have been expectations about many of these that never met reality at any point. Much of this is due to physical limitations that are not trivial to overcome. The internet gets faster and more stable, but the metaverse taking over did not happen, partially because many people still get nausea after a bit and no 10x scaling fixed that. A lot of what you described as "sucking" was not seen as "sucking" at the time. Nobody complained about phones being slow, because nobody expected to use phones the way we do today. The internet was slow and less stable, but nobody complained that they couldn't stream 4K movies, because nobody expected to. This is anachronistic. The fact that we can see how some things improved in this or that manner does not mean that LLMs will improve the way you think they will. Maybe we invent a different technology that does a better job. After all, it was not dial-up itself that became faster, and I don't think there were fanatics saying that dial-up technology would give us 1Gbps speeds. The problem with AI is that because scaling up compute has provided breakthroughs, some think that scaling up compute plus some technical tricks can solve all the current problems. I don't think anybody can say that we cannot invent a technology that overcomes these, but whether LLMs are that technology, one that can just keep scaling, is very much in doubt. In the last year or so there has been a lot of refinement and broadening of applications, but nothing like a breakthrough. |
| |
| ▲ | andreasmetsala 7 days ago | parent [-] | | > but the metaverse taking over did not happen, partially because many people still get nausea after a bit and no 10x scaling fixed that. Has VR really improved 10x? I lost touch after the HTC Vive and heard about the Valve Index, but I was under the impression that even the best Apple has on offer is 2x at most. | | |
| ▲ | jdiff 7 days ago | parent | next [-] | | I think you're reading a little too far into it. The number 10x was used earlier in the thread, so it was reused there to demonstrate that there are some problems scaling can't fix; it's not a statement on how far VR has or hasn't come. | |
| ▲ | 6 days ago | parent | prev [-] | | [deleted] |
|
|
|
| ▲ | runako 7 days ago | parent | prev | next [-] |
| > Phones sucked, pre 3G was slow, there wasn't much you could use them for before app stores and the cameras were potato quality This is a big rewrite of history. Phones took off because, before mobile phones, the only way to reach a person was to call while they were at home or at their office. People were unreachable for timespans that now seem quaint; texting made that communication asynchronous. The "potato" cameras marked the advent of people always having a camera with them. People using the Nokia 3210 were very much not anticipating when their phones would get good; they were already a killer app. That they improved was icing on the cake. |
| |
| ▲ | ARandumGuy 6 days ago | parent [-] | | > People using the Nokia 3210 were very much not anticipating when their phones would get good; they were already a killer app. That they improved was icing on the cake. It always bugs me whenever I hear someone defend some new tech (blockchain, LLMs, NFTs) by comparing it with phones or the internet or whatever. People did not need to be convinced to use cell phones or the internet. While there were absolutely some naysayers, the utility and usefulness of these technologies was very obvious by the time they became available to consumers. But also, there's survivorship bias at play here. There are countless promising technologies that never saw widespread adoption, and any given new technology is far more likely to end up a failure than to become "the next iPhone" or "the new internet." In short, you should sell your technology based on what it can do right now, not on what it might do in the future. If your tech doesn't provide utility right now, it should be developed for longer before you start charging money for it. And while there's certainly some use for LLMs, a lot of the current use cases being pushed (Google's "AI Overviews", shitty AI art, AIs writing out emails) aren't particularly useful. | | |
| ▲ | fragmede 6 days ago | parent | next [-] | | The technology to look at is shopping carts. They're obvious to us now, but when they were first introduced, stores hired actors to use them so that real customers would adopt the habit. There are various "killer" apps that are already very useful for their users; they'll just take a while to percolate out as people discover them. That you don't agree with what the corpos are pushing is on them. | | |
| ▲ | ARandumGuy 6 days ago | parent | next [-] | | But that's just more cherry-picking. You can always find some past success to push whatever point you're trying to make. But just because shopping carts were a huge hit doesn't mean that whatever you're trying to push will be. For example, it would be wrong for me to say that "hyperloop got a ton of hype and investments, and it failed. Therefore LLMs, which are also getting a ton of hype and investments, will also fail." Hyperloop and LLMs are fundamentally different technologies, and the failure of hyperloop is a poor indicator of whether LLMs will ultimately succeed. Which isn't to say we can't make comparisons to previous successes or failures. But those comparisons shouldn't be your main argument for the viability of a new technology. | |
| ▲ | komali2 6 days ago | parent | prev [-] | | People used to fill their bags with produce, bundles or bags of fish and meat, and here and there a couple bags or boxes of dry goods. Carts were a necessity to get people to interact with the new "center aisles" of the grocery store which is mostly full of boxed and canned garbage. |
| |
| ▲ | raincole 6 days ago | parent | prev [-] | | > People did not need to be convinced to use cell phones or the internet. Plenty of people don't need to be convinced to use LLM either... |
|
|
|
| ▲ | sidewndr46 6 days ago | parent | prev | next [-] |
| As others have mentioned, you are just writing your own history to suit your narrative. There is no evidence to support "People complained endlessly about the internet in the early to mid 90s". In the early-to-mid 1990s, people effectively did not use the internet. Usage was minuscule, limited to tiny niche groups; most people heard about the internet via the 90-second blurb on the evening news show. It wasn't until sometime after the launch of Facebook that the internet was even mainstream. So I really don't think people complained about the slowness of an internet they weren't using. I could go on, but I don't need to spend paragraphs refuting something that is obviously false. |
| |
| ▲ | area51org 6 days ago | parent | next [-] | | Having lived in that era: no one "complained endlessly", or even at all, about the internet. It was seen as magical. Compared to it not existing at all, slow wasn't all that awful. | | |
| ▲ | skydhash 6 days ago | parent | next [-] | | I remember using the internet around 2005, when you could hold a conversation while waiting for a page to load. No one complained, because you had a wealth of information at your fingertips. It was genuinely amazing to chat with someone anywhere in the world or to browse a forum. | |
| ▲ | nyonyo 5 days ago | parent | prev [-] | | >Compared to it not existing at all, slow wasn't all that awful. Slow is relative to the use, anyway. I remember the first time I saw ICQ's real-time chat, where people could watch you typing with very little delay; I was utterly fascinated that such a thing was even possible. Normal web pages not filled with animated GIFs weren't unbearably slow either. "Slow" is what happened when you tried to use RealPlayer and saw that dreaded "buffering" every 5 seconds of video. |
| |
| ▲ | 6 days ago | parent | prev | next [-] | | [deleted] | |
| ▲ | gyomu 6 days ago | parent | prev [-] | | > you are just writing your own history to suit your narrative Classic LLM behavior |
|
|
| ▲ | bunderbunder 7 days ago | parent | prev | next [-] |
| This is such selective hindsight, though. We remember the small minority of products that persisted and got better. We don't remember the majority of ones that fizzled out after the novelty wore off, or that ultimately plateaued. Me, I agree with the author of the article. It's possible that the technology will eventually get there, but it doesn't seem to be there now. And I prefer to make decisions based on present-day reality instead of just assuming that the future I want is the future I'll get. |
| |
| ▲ | chollida1 7 days ago | parent [-] | | > This is such selective hindsight, though. Ha;) Yes, when you provide examples to prove your point they are, by definition, selective:) You are free to develop your own mental models of what technology and companies to invest in. I was only trying to share my 20 years of experience with investing to show why you shouldn't discard current technology because of its current limits. | | |
| ▲ | bunderbunder 7 days ago | parent [-] | | Fair, but also, investing is kind of its own thing, because it's inherently trying to predict the future based on partial information today. Engineering decisions, which are closer to what TFA is talking about, have to be a lot more focused on the here and now. You can make bets on future R&D developments (e.g., the Apollo program), but that's a game best played when you also have some control over R&D budgeting and direction (e.g., the Apollo program), and when you don't have much other choice (e.g., the Apollo program). |
|
|
|
| ▲ | overgard 7 days ago | parent | prev | next [-] |
| I'm not a fan of the argument that LLMs have gotten X times better in the past few years, so they will continue to get X times better in the next few years. From what I can see, almost all the growth has come from optimizing a few techniques, and I'm not convinced we aren't going to get stuck in a local maximum (actually, I think that's the most likely outcome). Specifically, to me the limitation of LLMs is discovering new knowledge and reasoning about information they haven't seen before. LLMs still fail at things like counting the number of b's in the word "blueberry", or get distracted by random cat facts inserted into word problems (both issues I've seen appear in the last month). I don't mean to say they're a useless tool; I'm just not into the breathless hype. |
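| To make the counting example concrete, here's a minimal Python sketch (assuming the tiktoken tokenizer library is installed; the exact token split shown is illustrative) of why character-level questions are hard for a model that only ever sees tokens: |

```python
# A trivial program operating on characters gets the answer instantly;
# an LLM never sees characters, only opaque subword token ids.
import tiktoken  # assumption: tokenizer library available (pip install tiktoken)

word = "blueberry"
print(word.count("b"))  # 2

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode(word)
# The model receives token ids, not letters -- e.g. something like
# ['blue', 'berry'] -- so "how many b's?" asks it to recall a spelling
# it was never directly shown.
print([enc.decode([t]) for t in tokens])
```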
| |
| ▲ | AstroBen 6 days ago | parent [-] | | Relevant: https://xkcd.com/605/ The latest releases are seeing smaller and smaller improvements, if any. Unless someone can explain the technical reasons why they're likely to scale to being able to do X, it's a pretty useless claim. |
|
|
| ▲ | masterj 7 days ago | parent | prev | next [-] |
| > LLM"s amy not be able to build software today, but they are 10x better than where they were in 2022 when we first started using chatgpt. Its pretty reasonable to assume in 5 years they will be able to do these types of development tasks. We can expect them to be better in 5 years, but your last assertion doesn't follow. We can't assert with any certainty that they will be able to specifically solve the problems laid out in the article. It might just not be a thing LLMs are good at, and we'll need new breakthroughs that may or may not appear. |
|
| ▲ | mbesto 7 days ago | parent | prev | next [-] |
| We also thought 3D printing would print us a car, but alas. FWIW, 3D printing has come a long way, and I personally own a 3D printer. But the idea that it was going to completely disrupt manufacturing is simply not true. There are known limitations (how the heck are you going to squeeze a wood polymer through a metal tip?), and those limitations are physical, not technical. |
| |
| ▲ | chollida1 6 days ago | parent | next [-] | | Agreed on 3D printing, but that is a technology that would have failed my screening. It hasn't continued to see massive adoption and improvement despite the flaws people point out. It had initial success at printing basic plastic pieces but, as you correctly point out, has failed to print in other materials like metal, so it wouldn't pass my screening as it currently sits. | |
| ▲ | fragmede 6 days ago | parent | prev [-] | | The fact that when I need a bag clip I can just search for one in an app on my phone and hit print, mostly trouble-free, says that it's here. Sure, spending $1500 to save $3 isn't economically optimal, but 3D printing has disrupted things. Just look at the SpaceX rocket engines. |
|
|
| ▲ | fmbb 6 days ago | parent | prev | next [-] |
| People also complained a lot about VR. And NFTs had a lot of loud detractors. And everyone complained about a million other solutions that did not go anywhere. Still, a bunch of investors made a lot of money on VR, and very much so on NFTs. Investments being good is not an indicator of anything being useful. |
| |
| ▲ | danielbln 6 days ago | parent [-] | | I use LLMs every single day, for hours. I was suuuuuuper into VR in the early-to-mid 2010s, but even that didn't see much adoption among my peers, whereas everyone is using LLMs. And NFTs were always perceived as a scam, same as the breathless blockchain nonsense. LLMs have many, many issues, but I think they stick out as different from the other examples. |
|
|
| ▲ | jarjoura 6 days ago | parent | prev | next [-] |
| I see a distinction here: the foundation models aren't actually 10x better than in 2022. What has improved is that we have far more domain knowledge of how to get more out of slightly improved models. So consider your analogy: the internet was always useful, but it was JavaScript that caused the truly titanic shift in the software industry, even though the core internet backbone didn't improve as radically as you might imagine. JavaScript was hacked together as a toy scripting language meant to make pages more interactive, but it turned out to be the key piece in unlocking the 10x value of the already-existing internet. Agents and the explosion of all these little context services are where I see the same thing happening here. Right now they are buggy and mostly experimental toys, but they are unlocking that 10x value. |
| |
| ▲ | skydhash 6 days ago | parent [-] | | > JavaScript was hacked together as a toy scripting language meant to make pages more interactive, but it turned out to be the key piece in unlocking the 10x value of the already-existing internet. Was it? I remember installable software, far more than the web, being the core usage of computers. Even today, most people are using apps. |
|
|
| ▲ | ausbah 7 days ago | parent | prev | next [-] |
| Those are really good points, but LLMs have really started to plateau in their capabilities, haven't they? The improvement from GPT-2-class models to GPT-3 was much bigger than from 3 to 4, which was only somewhat bigger than from 4 to 5. Most of the vibe shift I think I've seen in the past few months around using LLMs for coding has come from improvements in dataset curation and UX, not fundamentally better tech. |
| |
| ▲ | worldsayshi 7 days ago | parent | next [-] | | > LLMs have really started to plateau That doesn't seem unexpected. Any technological leap seems to happen in sigmoid-like steps: when a fruitful approach is discovered, we run with it until diminishing returns set in. Often enough, a new approach opens doors to other approaches that build on it. It takes time to discover the next step in the chain, but when we do, we get a new sigmoid-like leap. And so on. | | |
| ▲ | worldsayshi 7 days ago | parent [-] | | Personally, my bet for the next fruitful step is something in line with what Victor Taelin [1] is trying to achieve, i.e. combining new approaches from old-school "AI" with GenAI. That's probably not exactly what he's trying to do, but maybe somewhere in the ballpark. 1 - https://x.com/victortaelin |
| |
| ▲ | bigstrat2003 7 days ago | parent | prev | next [-] | | Started? In my opinion they haven't gotten better since the release of ChatGPT a few years ago. The weaknesses are still just as bad, the strengths have not improved. Which is why I disagree with the hype saying they'll get better still. They don't do the things they are claimed to today, and haven't gotten better in the last few years. Why would I believe that they'll achieve even higher goals in the future? | | |
| ▲ | Closi 6 days ago | parent [-] | | I assume you don’t use these models frequently, because there is a staggering difference in response quality from frontier LLMs compared to GPT 3. Go open the OpenAI API playground and give GPT3 and GPT5 the same prompt to make a reasonably basic game in JavaScript to your specification and watch GPT 3 struggle and GPT 5 one-shot it. | | |
| ▲ | globular-toast 6 days ago | parent | next [-] | | Sure, but it's kind of like a road that never quite gets you anywhere. It seems to get closer and closer to the next town all the time, but ultimately it's still not there yet, and that's all that really matters. | |
| ▲ | chrz 6 days ago | parent | prev [-] | | They're faster and shinier and get lost less, but they still don't fly. |
|
| |
| ▲ | DanielHB 7 days ago | parent | prev | next [-] | | All the other things he mentioned didn't rely on breakthroughs; LLMs really do seem to have reached a plateau and need a breakthrough to push along to the next step. Thing is, breakthroughs are always X years away (50 for fusion power, for example). The only example he gave that actually was a big deal was mobile phones, where capacitive touchscreens really did catapult the technology forward. But it's not like cellphones weren't already super useful, profitable, and getting better over time before capacitive touchscreens were introduced. Maybe broadband internet also qualifies. | |
| ▲ | Closi 6 days ago | parent [-] | | > All the other things he mentioned didn't rely on breakthroughs; LLMs really do seem to have reached a plateau and need a breakthrough to push along to the next step. I think a lot of them relied on gradual improvement and lots of 'mini-breakthroughs' rather than one single breakthrough that changes everything. These mini-breakthroughs also took decades to realise themselves properly in almost every example on the list, not just a couple of years. My personal gut feel is that even if the core technology plateaus, there's still lots of iterative improvement to go after in the productisation/commercialisation of the existing technology (e.g. improving tooling/UI, applying it to real problems, productising current research, etc.). In electric-car terms, we are still at the stage where Tesla is shoving batteries into a Lotus Elise rather than releasing the Model 3. We might have the lithium-polymer batteries, but there's still lots of work to do to pull them into the final product. (Having said this, I don't think the technology has plateaued; I think we are just looking at it across a very narrow time span. If in 1979 you had said that computers had plateaued because there hadn't been much progress in the last 12 months, you would have been very wrong. Breakthroughs sometimes take longer as a technology matures, but that doesn't mean the technology two decades from now won't be substantially different.) |
| |
| ▲ | imtringued 6 days ago | parent | prev | next [-] | | There also is an absolutely massive gap between Llama 2 and Llama 3. The Llama 3.1 models represent the beginning of usable open weight models. Meanwhile Llama 4 and its competitors seem to be incremental improvements. Yes, the newest models are so much better that they obsolete the old ones, but now the biggest differences between models is primarily what they know (parameter count and dataset quality) and how much they spend thinking (compute budget). | |
| ▲ | stpedgwdgfhgdd 7 days ago | parent | prev | next [-] | | There is a big difference between Claude Code today and 6 months ago. Perhaps the LLMs plateau, but the tooling does not. | |
| ▲ | NitpickLawyer 7 days ago | parent | prev | next [-] | | > but LLMs have really started to plateau in their capabilities, haven't they? Uhhh, no? In the past month we've had: - LLMs (3 different models) getting gold at the IMO - gold at the IOI - beating 9/10 human developers at AtCoder heuristics (optimisation problems), with the single human who actually beat the machine saying he was exhausted and next year it'll probably be over - agentic coding that actually works, staying coherent over 30-90 minute sessions and actually finishing tasks - a 4-6x reduction in price for top-tier (SotA?) models; OpenAI's "best" model now costs $10/MTok while retaining 90+% of the performance of their previous SotA models that were $40-60/MTok - several "harnesses" released by every model provider. Claude Code seems to remain the best, but alternatives are popping off everywhere - geminicli, opencoder, qwencli (forked, but still), etc. - open-source models that are getting close to SotA again, 6-12 months behind (depending on who you ask), open and cheap to run (~$2/MTok on some providers). I don't see the plateauing in capabilities. LLMs are plateauing only in benchmarks, where "number goes up" can only go so far before it becomes useless. In my opinion, regular benchmarks have become useless: MMLU & co are cute, but agentic performance is what matters. And those capabilities have only improved, and will continue to improve with better data, better signals, better training recipes. Why do you think every model provider is heavily subsidising coding right now? They all want that sweet, sweet data & signal so they can improve their models. | |
| ▲ | tripzilch 3 days ago | parent [-] | | > I don't see the plateauing in capabilities. LLMs are plateauing only in benchmarks Don't you mean the opposite? It got gold at the IMO, which is a benchmark, but it's nowhere remotely close to having even the basic mathematical capabilities someone who got IMO gold can be expected to have. It's unable to deal with negations, and it gets confused by a question stated in something other than its native alphabet. |
| |
| ▲ | cameronh90 7 days ago | parent | prev [-] | | I'm not sure I'd describe it as a plateau. It might be, but I'm not convinced. Improvements are definitely not as immediately obvious now, but how much of that is due to it being more difficult to accurately gauge intelligence above a certain point? Or even the marginal real-life utility of intelligence _itself_ starting to plateau? A (bad) analogy: I can pretty easily tell the difference between a cat and an ape, and the differences in capability are blatantly obvious, but the improvement in going from IQ 70 to Einstein is much harder to assess and arguably not that useful for most tasks. I tend to find that when I switch to a new model it doesn't seem any better, but then at some point after using it for a few weeks I'll try the older model again and be quite surprised at how much worse it is. |
|
|
| ▲ | einrealist 6 days ago | parent | prev | next [-] |
| > Twitter sucked [...] Electric cars sucked [...] Phones sucked All these things are not black boxes; they are mostly deterministic. Given the inputs, you know EXACTLY what to expect as output. That's not the case with LLMs, given how they are trained and how they work internally. We are certainly getting a better understanding of how to adjust the inputs to get a desired output, but that's far from guaranteed at the level of the examples you mentioned. That's a fundamental problem with LLMs, and you can see it in how industry actors are building solutions around it. Reasoning (chain-of-thought) is basically a band-aid to narrow a decision tree, because the LLM does not really "reason" about anything. And the results only get better with more training data. We literally have to brute-force useful results by throwing more compute and memory at the problem (and destroying the environment and climate by doing so). The stagnation of recent model releases does not look good for this technology. |
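| To illustrate the determinism point with a toy sketch (the logit values below are made up; real models have vocabularies of ~100k tokens, but the mechanism is the same): an LLM's next token is *sampled* from a probability distribution, so at temperature > 0 the same input can produce different outputs. |

```python
# Toy sketch of next-token sampling with temperature: the same input
# distribution yields different outputs across runs, unlike a deterministic
# system where input fixes output. (Logit values are hypothetical.)
import math
import random

logits = {"yes": 2.0, "no": 1.5, "maybe": 0.5}  # made-up next-token scores

def sample(logits: dict[str, float], temperature: float = 1.0) -> str:
    # Softmax with temperature, then draw one token proportionally.
    weights = {tok: math.exp(score / temperature) for tok, score in logits.items()}
    total = sum(weights.values())
    r = random.uniform(0.0, total)
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # guard against float rounding

print([sample(logits) for _ in range(5)])        # e.g. ['yes', 'no', 'yes', 'maybe', 'yes']
print([sample(logits, 0.01) for _ in range(5)])  # near-greedy: almost always 'yes'
```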
|
| ▲ | mrheosuper 6 days ago | parent | prev | next [-] |
| > Electric cars sucked: no charging, low range, expensive, and yet no matter how much people complained about them, they kept getting better. It took them over a century to get to this point. |
| |
| ▲ | jansper39 6 days ago | parent [-] | | They've not been in active development for that time though, only really the last 12 years. |
|
|
| ▲ | isoprophlex 7 days ago | parent | prev | next [-] |
| Now think about hoverboards, self-cleaning shirts, moon bases, flying cars, functioning democracies, and whatever VR tech is described in Snow Crash. Where on the spectrum will LLMs fall? |
|
| ▲ | 4b11b4 7 days ago | parent | prev | next [-] |
| "it's pretty reasonable".. big jump? |
|
| ▲ | 6 days ago | parent | prev [-] |
| [deleted] |