furyofantares 4 days ago
I have a theory about why it's so easy to underestimate long-term progress and overestimate short-term progress. Before a technology hits a threshold of "becoming useful", it may have a long history of progress behind it. But that progress is only visible and felt to researchers. In practical terms, there is no progress being made as long as the thing is going from not-useful to still not-useful.

So then it goes from not-useful to useful-but-bad and it's instantaneous progress. Then as more applications cross the threshold, and as they go from useful-but-bad to useful-but-OK, progress all feels very fast, even if it's the same speed as before. So we overestimate short-term progress because we overestimate how fast things are moving when they cross these thresholds.

But then as fewer applications cross the threshold, and as things go from OK-to-decent instead of bad-to-OK, that progress feels a bit slower. And again, it might not be any different in reality, but that's how it feels. So then we underestimate long-term progress because we've extrapolated a slowdown that might not really exist.

I think it's also why we see a divide where there's lots of people here who are way overhyped on this stuff, and also lots of people here who think it's all totally useless.
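A toy sketch of that dynamic (my own illustration, not the commenter's model; every number below is made up): let the underlying capability improve at a constant rate, give each hypothetical application its own usefulness threshold, and count how many applications become useful each year. The "felt" progress spikes in the middle years and tapers off at the end, even though the real rate never changes.

    # Toy model (illustrative only): capability improves at a constant rate,
    # but "perceived progress" is the rate at which applications cross their
    # usefulness thresholds -- it spikes, then tapers, while the underlying
    # progress never changes speed.
    import random

    random.seed(0)

    YEARS = 30
    CAPABILITY_PER_YEAR = 1.0                                 # constant, "real" progress
    thresholds = [random.gauss(15, 4) for _ in range(1000)]   # per-application usefulness bar

    crossed_so_far = 0
    for year in range(1, YEARS + 1):
        capability = CAPABILITY_PER_YEAR * year
        crossed = sum(1 for t in thresholds if t <= capability)
        newly_crossed = crossed - crossed_so_far
        crossed_so_far = crossed
        # "Felt" progress tracks newly useful applications, not capability itself.
        print(f"year {year:2d}: capability {capability:4.1f}, "
              f"newly useful apps {newly_crossed:3d}, total useful {crossed:4d}")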
svantana 4 days ago
> why it's so easy to underestimate long-term progress and overestimate short-term progress

I dunno, I think that's mostly post-hoc rationalization. There are equally many cases where long-term progress has been overestimated after some early breakthroughs: think space travel after the moon landing, supersonic flight after the Concorde, fusion energy after the H-bomb, and AI after the ENIAC. Turing himself guesstimated that human-level AI would arrive in the year 2000. The only constant is that the further into the future you go, the harder it is to predict.
strken 4 days ago
I think that for a lot of examples, the differentiating factor is infrastructure rather than science. The current wave of AI needed fast, efficient computing power in massive data centres powered by a large electricity grid. The textiles industry in England needed coal mining, international shipping, tree trunks from the Baltic region, cordage from Manila, and enclosure plus the associated legal change plus a bunch of displaced and desperate peasantry. Mobile phones took portable radio transmitters, miniaturised electronics, free space on the spectrum, population density high enough to make a network of towers economically viable, the internet backbone and power grid to connect those towers to, and economies of scale provided by a global shipping industry.

Long-term progress often seems to be a dance where a boom in infrastructure unlocks new scientific inquiry, then science progresses to the point where it enables new infrastructure, then the growth of that new infrastructure unlocks new science, and repeat. There's also lag time from bringing new researchers into a field and throwing greater funding into more labs, where the infrastructure is R&D itself.
xbmcuser 4 days ago
There is also an adoption curve. The people that grew up without it won't use it as much as children that grew up with it and know how to use it.

My sister is an admin in a private school (not in the USA), and the owner of the school is someone willing to adopt new tech very quickly, so he got the whole school admin staff subscriptions to ChatGPT. At the time my sister used to complain a lot about being overworked and having to bring work home every day. Two years later she uses it for almost everything, and despite her duties increasing she says she gets a lot more done and rarely has to bring work home. In the past they also had an English major specifically to go over all correspondence and make sure there were no grammatical or language mistakes; that person was assigned a different role as she was no longer needed.

I think as newer generations who are used to using LLMs for things start getting into the workforce and into higher roles, the real effect of LLMs will be felt more broadly. Currently, apart from early adopters, the number of people who use LLMs for all the things they can be used for is still not that high.
hirako2000 4 days ago
GPT-3 is when the masses started to get exposed to this tech, and it felt like a revolution. GPT-3.5 felt like things were improving super fast and created the feeling that the near future would be unbelievable. By the 4/o series, it felt like things had improved, but users weren't as thrilled as they were by the leap to 3.5. You can call that bias, but the version 5 improvements clearly display an even greater slowdown, and that's two long years since GPT-4.

For context:

- GPT-3 came out in 2020
- GPT-3.5 in 2022
- GPT-4 in 2023
- GPT-4o and company in 2024

After 3.5, things slowed down, in terms of impact at least. Larger context windows, multi-modality, mixture of experts, and more efficiency: all great, significant features, but they all pale compared to the impact RLHF already made four years ago.
vczf 4 days ago
The more general pattern is "slowly at first, then all at once." It almost universally describes complex systems.
heywoods 4 days ago
Your threshold theory is basically Amara's Law with better psychological scaffolding. Roy Amara nailed the what ("we tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run") [1], but you're articulating the why better than most academic treatments. The invisible-to-researchers phase followed by the sudden usefulness cascade is exactly how these transitions feel from the inside.

This reminds me of the CPU wars circa 2003-2005. Intel spent years squeezing marginal gains out of Pentium 4's NetBurst architecture, each increment more desperate than the last. From 2003 to 2005, Intel shifted development away from NetBurst to focus on the cooler-running Pentium M microarchitecture [2]. The whole industry was convinced we'd hit a fundamental wall. Then boom, Intel released dual-core processors under the Pentium D brand in May 2005 [2] and suddenly we're living in a different computational universe. But the multi-core transition wasn't sudden at all. IBM shipped the POWER4 in 2001, the first non-embedded microprocessor with two cores on a single die [3]. Sun had been preaching parallelism since the 90s. It was only "sudden" to those of us who weren't paying attention to the right signals.

Which brings us to the $7 trillion question: where exactly are we on the transformer S-curve? Are we approaching what Richard Foster calls the "performance plateau" in "Innovation: The Attacker's Advantage" [4], where each new model delivers diminishing returns? Or are we still in that deceptive middle phase where progress feels linear but is actually exponential?

The pattern-matching pessimist in me sees all the classic late-stage S-curve symptoms. The shift from breakthrough capabilities to benchmark gaming. The pivot from "holy shit it can write poetry" to "GPT-4.5-turbo-ultra is 3% better on MMLU." The telltale sign of technological maturity: when the marketing department works harder than the R&D team.

But the timeline compression with AI is unprecedented. What took CPUs 30 years to cycle through, transformers have done in 5. Maybe software cycles are inherently faster than hardware. Or maybe we've just gotten better at S-curve jumping (OpenAI and Anthropic aren't waiting for the current curve to flatten before exploring the next paradigm).

As for whether capital can override S-curve dynamics... Christ, one can dream. IBM torched approximately $5 billion on Watson Health acquisitions alone (Truven, Phytel, Explorys, Merge) [5]. Google poured resources into Google+ before shutting it down in April 2019 due to low usage and security issues [6]. The sailing ship effect (coined by W.H. Ward in 1967, where new technology accelerates innovation in the incumbent technology) [7] is real, but you can't venture-capital your way past physics. I suspect all this capital pouring into AI might actually accelerate S-curve maturation rather than extend it. All that GPU capacity, all those researchers, all that parallel experimentation? We're speedrunning the entire innovation cycle, which means we might hit the plateau faster too.

You're spot on about the perception divide, imo. The overhyped folks are still living in 2022's "holy shit ChatGPT" moment, while the skeptics have fast-forwarded to 2025's "is that all there is?" Both groups are right, just operating on different timescales. It's Schrödinger's S-curve, where things feel simultaneously revolutionary and disappointing, depending on which part of the elephant you're touching.
The real question isn't whether we're approaching the limits of the current S-curve (we probably are), but whether there's another curve waiting in the wings. I'm not a researcher in this space, nor do I follow the AI research beat closely enough to weigh in, but hopefully someone in the thread can. With CPUs, we knew dual-core was coming because the single-core wall was obvious. With transformers, the next paradigm is anyone's guess. And that uncertainty, more than any technical limitation, might be what makes this moment feel so damn weird.

References:

[1] "Amara's Law" - https://en.wikipedia.org/wiki/Roy_Amara
[2] "Pentium 4" - https://en.wikipedia.org/wiki/Pentium_4
[3] "POWER4" - https://en.wikipedia.org/wiki/POWER4
[4] Innovation: The Attacker's Advantage - https://annas-archive.org/md5/3f97655a56ed893624b22ae3094116...
[5] IBM Watson Slate piece - https://slate.com/technology/2022/01/ibm-watson-health-failu...
[6] "Expediting changes to Google+" - https://blog.google/technology/safety-security/expediting-ch...
[7] "Sailing ship effect" - https://en.wikipedia.org/wiki/Sailing_ship_effect