Is CPU design hitting a (soft) speed limit?

	▲	Is CPU design hitting a (soft) speed limit?
		3 points by leecommamichael 10 hours ago | 4 comments
		I'm not in the hardware industry, and am curious what the future might hold. We've been stuck in the 3-5 GHz range for a long time. I think we're scaling by adding cache, cores, and specialized instructions. Due to observations like Amdahl's Law, PC hobbyists aren't seeing great returns on new machines. We've had 8+ logical cores for a long time now. The gates in a CPU are getting very difficult and costly for companies to shrink further. We are approaching a scale where more quantum effects are at play, and there is actually a "minimum scale" for transistors. So where can we go from here? What will the effects of these changes (or lack thereof) be on software development?
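For readers unfamiliar with it, Amdahl's Law bounds the speedup on n cores at 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. A minimal sketch of that formula follows; the function name and the 80% parallel fraction are illustrative assumptions, not measurements of any real workload.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup per Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    cores: number of cores the parallel part is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical program that is 80% parallelizable.
    for cores in (1, 2, 4, 8, 16, 64):
        print(f"{cores:>3} cores: at most {amdahl_speedup(0.8, cores):.2f}x")
```

Under that assumed 80% figure the bound can never reach 5x no matter how many cores are added, which is one way to read the "not seeing great returns" observation.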
	▲	Flundstrom2 an hour ago
		The GHz isn't the main issue; it is latency. We were stuck at 3-3.5 GHz for a long while, with dual core being super expensive. Now we're at 6-6.2 GHz in turbo mode. Can we go faster? Likely, but I don't see 12 GHz within the foreseeable future. Instead, internal parallelism within a core has increased significantly. The bottleneck is cache reload and flushing from/to RAM. Highly optimized programs /may/ fit in cache (assuming they don't get swapped out due to task switching), but most don't, since they use large frameworks and/or are interpreted, or run on Windows, which since the dawn of time has always taken 2-3% of the CPU. But there are a lot of programs that are unable to utilize multiple cores, so even if there are 20 cores available, 19 are unused and the last one is limited. Lower voltage allows for somewhat faster clock speeds, allowing faster execution. Microcode-based instruction sets (x64) benefit the most from internal parallelism due to the ability to begin an instruction before the previous one has completed, while ARM CPUs generally split entire instructions into independent parts that are all executed simultaneously, but there's a limit on how much that can be parallelized given a certain RAM bus width. Increasing RAM bus width, especially external RAM bus width, is likely where the biggest gain is.
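To put the bus-width point in rough numbers: theoretical peak memory bandwidth is the bus width in bytes multiplied by the transfer rate. The sketch below is only that arithmetic; the function name is made up for illustration, the DDR5-6400 figures are an assumed example configuration, and sustained real-world bandwidth is lower than this peak.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, megatransfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes).

    bus_width_bits: total data bus width, e.g. 64 for one classic DIMM channel.
    megatransfers_per_s: transfer rate in MT/s, e.g. 6400 for DDR5-6400.
    """
    if bus_width_bits <= 0 or megatransfers_per_s <= 0:
        raise ValueError("bus width and transfer rate must be positive")
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * megatransfers_per_s * 1e6 / 1e9


if __name__ == "__main__":
    # Assumed example: same transfer rate, progressively wider buses.
    for width in (64, 128, 256):
        print(f"{width:>3}-bit bus @ 6400 MT/s: {peak_bandwidth_gb_s(width, 6400):.1f} GB/s")
```

Doubling the width doubles the peak at the same clock, which is why a wider external bus is attractive when clock speed is hard to raise.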
	▲	dlcarrier 7 hours ago
		We hit that wall 20 years ago. That's why we went parallel. It's not just extra cores that are more parallel; extra instructions get their performance from parallelism, too. They allow an operation to be performed simultaneously on every entry in an array, much quicker than a loop performing the instructions one at a time. Caches also get not only bigger but wider, so a single fetch brings in more data.
		The physics of the gates isn't the limiting factor. They only work because of quantum effects, so they keep working as they shrink. The voltage does need to decrease to limit tunneling, but it also gets to decrease because of the thinner oxide layer and lower capacitance, and that allows for increased speed and reduced power consumption. The limiting factor is manufacturing at that small a scale, with light wavelengths being too large for etching features small enough to shrink transistors further. ASML is the only company that makes current-generation extreme ultraviolet manufacturing equipment, and researchers are looking into using X-rays or mechanical stamping for future generations.
		On top of more parallelism, recent gains have come from better packaging, with 3D stacking shortening the distance signals need to travel, allowing busses to operate faster. For example, AMD's X3D processors couldn't have as much cache as they have if it were manufactured on the same die as the processor, unless they slowed it down, which would be counterproductive. Stacking does limit heat dissipation, but it also reduces heat generation, because the busses don't need to be driven as hard, so it's still worthwhile.
		We've also made a lot of gains from better transistor geometry, with increasing ratios of gates to channels creating a larger difference in resistance between on and off states. If we can keep shrinking manufacturing processes, even to the point that we design to the nearest atom, we'll be able to use that precision to keep creating better geometry, even though the transistors aren't meaningfully smaller.
		One thing is for certain, though: even though current processors only work because of quantum effects, when AI is no longer the peak buzzword, quantum will come back into vogue as the snazziest buzzword, and future generations of processors that are just a regular evolution of current technology will be marketed as quantum processors, despite not using qubits or quantum annealing.
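The point about instructions that operate on many array entries at once can be modeled by counting instructions: with W lanes per register, processing N elements takes roughly ceil(N / W) vector instructions instead of N scalar ones. This is a toy counting model in Python, not real SIMD code; the function name is hypothetical, and the 128/256/512-bit register widths with 32-bit elements are assumed examples.

```python
def vector_instruction_count(n_elements: int, register_bits: int, element_bits: int = 32) -> int:
    """Number of vector instructions needed to touch n_elements once.

    register_bits: width of the vector register (e.g. 128, 256, 512).
    element_bits: size of each array element (32 for a single-precision float).
    """
    lanes = register_bits // element_bits  # elements handled per instruction
    if lanes < 1:
        raise ValueError("register must hold at least one element")
    if n_elements < 0:
        raise ValueError("n_elements must be non-negative")
    return -(-n_elements // lanes)  # integer ceiling division


if __name__ == "__main__":
    n = 1000
    print(f"scalar loop: {vector_instruction_count(n, 32)} instructions")
    for bits in (128, 256, 512):
        print(f"{bits}-bit vectors: {vector_instruction_count(n, bits)} instructions")
```

The model ignores loads, stores, and memory bandwidth, so it is an upper bound on the benefit rather than a prediction of measured speedup.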
	▲	wmf 9 hours ago
		It's slowing down, but Apple is still improving 10-15% per year. Zen 6 is rumored to hit 6.6 GHz later this year. Apple and Intel still have room to steal V-Cache.
		> What will the effects of these changes (or lack thereof) be on software development?
		Probably a Python interpreter running on WASM inside Electron in a VM.
	▲	re-thc 9 hours ago
		> We've been stuck in the 3-5 GHz range for a long time. I think we're scaling by adding cache, cores, and specialized instructions.
		Well, whatever works. Does it matter if it is GHz or something else?
		> Due to observations like Amdahl's Law, PC hobbyists aren't seeing great returns on new machines. We've had 8+ logical cores for a long time now.
		If you look at Ryzen gains per generation, I'm not sure how you can say this. Maybe less lately due to AI bloating the prices, attention shifting to GPUs, etc., but CPUs have definitely been growing.
		> So where can we go from here? What will the effects of these changes (or lack thereof) be on software development?
		Software development has introduced so much bloat that there's infinite room to grow before it is a hardware issue.