philipwhiuk | a day ago

> Moore will continue to exponentially decrease costs over time as with all other workloads.

There's absolutely no guarantee of this. The continuation of Moore's law is far from certain (NVIDIA thinks it's dead already).

timschmidt | a day ago

> NVIDIA thinks it's dead already

Perhaps that's what Jensen says publicly, but Nvidia's next-generation chip contains more transistors than the last, and the one after that will too. Let me know when they align their trillions of dollars behind smaller, less complex designs; then I'll believe that they think Moore's law is out of juice. Until then, they can sit with the group of people who've been vocally wrong about Moore's law's end for the last 50 years.

Our chips are still overwhelmingly 2D in design: just a few dozen layers thick, but billions of transistors wide. We have quite a ways to go based on a first-principles analysis alone. And indeed, that's what chip engineers like Jim Keller say: https://www.youtube.com/watch?v=c01BlUDIlK4

So ask yourself how it benefits Jensen to convince you otherwise.

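For a rough sense of that first-principles headroom, here is a back-of-envelope sketch; the 100-billion-transistor and 50-layer figures are loose illustrative assumptions, not the specs of any real chip:

```python
import math

# Loose illustrative assumptions: a large modern die with ~100 billion
# transistors spread across a few dozen device/metal layers.
transistors = 100e9
layers = 50

per_layer = transistors / layers   # ~2 billion transistors per layer
side = math.sqrt(per_layer)        # edge length if one layer were laid out square

print(f"~{side:,.0f} transistors wide, but only {layers} layers deep")
# -> ~44,721 wide vs. 50 deep: effectively a 2D design, with enormous
#    headroom remaining in the vertical dimension.
```
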
adgjlsfhk1 | a day ago

Progress continues, but at a far slower rate than it used to. Nvidia has gained roughly 6x density in the past 9 years (1080 to 5090), while a doubling every 2 years would have meant more than 20x density over those 9 years. The past 6 years (3090 to 5090) are even worse, with only about a 3x density gain.

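To make the arithmetic explicit (a minimal sketch; the ~6x and ~3x density ratios are the figures quoted in the comment above, not independently verified):

```python
import math

def doubling_time(ratio: float, years: float) -> float:
    """Years per doubling, given a total growth `ratio` over `years`."""
    return years * math.log(2) / math.log(ratio)

# Density ratios quoted above (taken on faith, not independently verified)
print(f"1080 -> 5090: {doubling_time(6, 9):.1f} years per doubling")   # ~3.5
print(f"3090 -> 5090: {doubling_time(3, 6):.1f} years per doubling")   # ~3.8
print(f"Moore's-law pace over 9 years: {2 ** (9 / 2):.1f}x density")   # ~22.6
```

On those quoted figures, density still compounds, just with a doubling period closer to 3.5 to 4 years than the classic 2.
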
timschmidt | a day ago

Moore's law says nothing about density. "The complexity for minimum component costs has increased at a rate of roughly a factor of two per year." Density is one way in which industry has met this observation over the decades. New processes (NMOS, CMOS, etc.) are another, as are new packaging techniques (flip chip, BGA, etc.) and new substrates. There's no limit to process innovation.

Nvidia is also optimizing its designs for things other than minimum component cost: higher clock speeds, lower temperatures, lower power consumption, etc. It may seem like I'm picking a nit here, but such compromises are fundamental to the cost efficiency Moore was referencing. All the data I've seen, once fully considered, indicates that Moore's law is healthy and thriving.

bilekas | a day ago

> GPU compute in datacenters has been a thing for at least 20 years. Many of the top500 have included significant GPU clusters for that long.

Of course they've been a thing, but for specialised situations, maybe rendering farms or backroom mining centers; it's disingenuous to claim that there's not an exponential growth in GPU usage.

timschmidt | a day ago

Of course they've been a thing, but for specialized situations, maybe calculating trajectories or breaking codes; it's disingenuous to claim that there's not an exponential growth in digital computer usage.

Jest aside, the use of digital computation has exploded exponentially, for sure. But alongside that explosion, fueled by it and fueling it reciprocally, the cost (in energy and dollars) of each computation has plummeted exponentially.

bilekas | a day ago

I really would like to see more of your data showing that; I think it would put this discussion to rest, actually, because I keep seeing articles that dispute it. At least older ones that ring bells, specifically https://epoch.ai/blog/trends-in-the-dollar-training-cost-of-...

timschmidt | a day ago

You can find plenty of jumping-off points for research here: https://en.wikipedia.org/wiki/Performance_per_watt

Along with this lovely graph, captioned "Exponential growth of supercomputer performance per watt based on data from the Green500 list" (note the log scale): https://en.wikipedia.org/wiki/Performance_per_watt#/media/Fi...

From the section about GPU performance per watt, I'll quote: "With modern GPUs, energy usage is an important constraint on the maximum computational capabilities that can be achieved. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design."

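That last sentence is simple arithmetic: peak throughput is roughly the power budget times the efficiency. A minimal sketch, with both numbers hypothetical rather than taken from any real card:

```python
# Peak throughput ~= power budget x efficiency.
# Both numbers below are hypothetical, purely for illustration.
power_budget_watts = 450   # assumed board power limit
gflops_per_watt = 100      # assumed sustained efficiency

peak_tflops = power_budget_watts * gflops_per_watt / 1000
print(f"Peak throughput: {peak_tflops:.0f} TFLOPS")  # 45 TFLOPS
```

Under this framing, doubling performance per watt at a fixed power budget doubles peak performance, which is why the Green500 trend matters for the cost argument.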