bob1029 10 hours ago

A fun experiment, but I wonder how many out there seriously think we could ever completely rid ourselves of the CPU. It seems to be a rising sentiment.

The cost of communicating information through space is dealt with in fundamentally different ways here. On the CPU it is addressed directly. The actual latency is minimized as much as possible, usually by predicting the future in various ways and keeping the spatial extent of each device (core complex) as small as possible. The GPU hides latency with massive parallelism. That's why we can put them across relatively slow networks and still see excellent performance.
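To make the latency-hiding idea concrete, here's a hedged Python sketch of my own (not from the comment): `asyncio.sleep` stands in for a long-latency memory access, and with many requests in flight the stalls overlap, so total wall-clock time approaches one stall rather than the sum of all of them.

```python
import asyncio
import time

# Toy model of latency hiding: each "memory access" stalls for 50 ms,
# but with many requests in flight the stalls overlap.
async def fetch(i: int) -> int:
    await asyncio.sleep(0.05)   # stand-in for a long-latency operation
    return i * i

async def run_parallel(n: int) -> list:
    # Launch all n requests at once and wait for them together.
    return await asyncio.gather(*(fetch(i) for i in range(n)))

start = time.perf_counter()
results = asyncio.run(run_parallel(10))
elapsed = time.perf_counter() - start
# Ten 50 ms stalls overlap into roughly one stall of wall-clock time,
# far below the ~0.5 s a strictly serial version would take.
assert elapsed < 0.5
```

The GPU does the hardware analogue of this: when one group of threads stalls on memory, the scheduler simply runs another.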

Latency hiding cannot deal well with workloads that are branchy and serialized, because you can only have one logical thread throughout. The CPU dominates this area because it doesn't cheat: it directly targets the objective. Making efficient, accurate control flow decisions tends to be more valuable than being able to process data in large volumes. It just happens that there are a few exceptions to this rule that are incredibly popular.
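A minimal sketch of what "branchy and serialized" means in practice (my own illustration): a Collatz-style loop where every control-flow decision depends on the result of the previous iteration, so no amount of parallel hardware can shorten the chain.

```python
def collatz_steps(n: int) -> int:
    """Count Collatz steps down to 1. Every iteration branches on data
    produced by the previous iteration, so the work is one logical
    thread: parallelism and latency hiding buy you nothing here."""
    steps = 0
    while n != 1:
        if n % 2 == 0:      # branch depends on the prior result
            n //= 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps
```

For example, `collatz_steps(27)` takes 111 iterations, and not one of them can start before the previous one finishes.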

st_goliath 6 hours ago | parent | next [-]

> I wonder how many out there seriously think we could ever completely rid ourselves of the CPU. It seems to be a rising sentiment.

This sentiment is not a recent thing. Ever since GPGPU became a thing, there have been people who hear about it for the first time, don't understand processor architectures, and get excited about GPUs magically making everything faster.

I vividly recall a discussion with some management type back in 2011, who was gushing about getting PHP to run on the new Nvidia Teslas and how amazingly fast websites would be!

Similar discussions also spring up around FPGAs again and again.

The more recent change in sentiment is a different one: the "graphics" origin of GPUs seems to have been lost to history. I have met people (plural) in recent years who thought (surprisingly long into the conversation) that I meant Stable Diffusion when talking about rendering pictures on a GPU.

Nowadays, the 'G' in GPU probably stands for GPGPU.

ecshafer 5 hours ago | parent [-]

The dream, I think, has always been heterogeneous computing. The closest today is probably Apple, with their multi-core CPUs mixing performance and efficiency cores, and a GPU with unified memory. (Someone with more knowledge of computer architecture could probably correct me here.)

Have a CPU, GPU, FPGA, and other special-purpose chips like neural accelerators, all with unified memory, and somehow pipeline each workload to whichever chip handles it best.

I wasn't really aware people thought we would be running websites on GPUs.

volemo 10 hours ago | parent | prev | next [-]

I see us not getting rid of the CPU, but the CPU and GPU eventually being consolidated into one system of heterogeneous computing units.

nine_k 8 hours ago | parent | next [-]

CPU and GPU have very different ways of scheduling instructions, requiring somewhat different interfaces and programming models. I'd hazard to say that a GPU and CPU with unified memory access (like Apple's M series, and most mobile chips) is already such a consolidated system.

amelius 7 hours ago | parent [-]

nVidia Jetson also has unified memory access btw.

junon 5 hours ago | parent | prev | next [-]

We're getting there already with e.g. Grace-Blackwell chips.

jagged-chisel 9 hours ago | parent | prev [-]

Agreed. Much like “RISC is gonna replace everything” - it didn’t. Because the CPU makers incorporated lessons from RISC into their designs.

I can see the same happening to the CPU. It will just take on the appropriate functionality to keep all the compute in the same chip.

It’s gonna take a while because Nvidia et al. like their moats.

StilesCrisis 6 hours ago | parent | next [-]

CISC only survived because CPUs now dedicate a ton of silicon to decoding the CISC stream into RISC-y microcode. RISC CPUs can avoid this completely, but it turns out backwards compatibility was important to the market and the transistor cost of "instruction decode" just adds like +1 pipeline depth or something.

zephen 4 hours ago | parent [-]

> CPUs now dedicate a ton of silicon to decoding the CISC stream into RISC-y microcode.

In absolute terms, this is true. But in relative terms, you're talking less than 1% of the die area on a modern, heavily cached, heavily speculative, heavily predictive CPU.

zozbot234 9 hours ago | parent | prev [-]

> It will just take on the appropriate functionality to keep all the compute in the same chip.

So, an iGPU/APU? Those exist already. Regardless, the most GPU-like CPU architecture in common use today is probably SPARC, with its 8-way SMT. Add per-thread vector SIMD compute to something like that, and you end up with something that has broadly similar performance constraints to an iGPU.

spot5010 6 hours ago | parent | prev | next [-]

I don't think we get rid of the CPU. But the relationship will be inverted. Instead of the CPU calling the GPU, it might be that the GPU becomes the central controller and builds programs and calls the CPU to execute tasks.

jerf an hour ago | parent | next [-]

But... why?

How do you win moving your central controller from a 4GHz CPU to a multi-hundred-MHz single GPU core?

If we tried this, all we'd do is isolate a couple of cores in the GPU, let them run at some gigahertz, and then equip them with the additional operations they'd need to be good at coordinating tasks... or, in other words, put a CPU in the GPU.

treyd 5 hours ago | parent | prev | next [-]

This will never happen without completely reimagining how process isolation works and rewriting any OS you'd want to run on that architecture.

pklausler 4 hours ago | parent | prev [-]

Sounds reminiscent of the CDC 6600, a big fast compute processor with a simple peripheral processor whose barreled threads ran lots of the O/S and took care of I/O and other necessary support functions.

fc417fc802 9 hours ago | parent | prev | next [-]

> I wonder how many out there seriously think we could ever completely rid ourselves of the CPU.

How do you class systems like the PS5 that have an APU plugged into GDDR instead of regular RAM? The primary remaining issue is the limited memory capacity.

I wonder if we might see a system with GPU-class HBM on the package in lieu of VRAM, coupled with regular RAM on the board for the CPU portion?

chris_money202 8 hours ago | parent [-]

I don’t think the remaining issue is memory capacity. CPUs are designed to handle nonlinear memory access, and that is how all modern software targeting a CPU is written. GPUs are designed for linear memory access. These are fundamentally different access patterns; the optimal solution is to have two distinct processing units.

fc417fc802 2 hours ago | parent | next [-]

GDDR has high bandwidth but limited capacity. Regular RAM is the opposite, leaving typical APUs memory bandwidth starved.

Both types of processor perform much better with linear access. Even for data in the CPU cache you get a noticeable speedup.
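You can see the access-pattern effect for yourself with a quick sketch (hedged: in pure Python the interpreter overhead dilutes the cache effect, so a C or NumPy version shows the gap far more dramatically). Sum the same array in sequential order and in a shuffled order and compare timings:

```python
import random
import time

data = list(range(1_000_000))
order = list(range(len(data)))
random.shuffle(order)

# Linear walk: hardware prefetchers can stream cache lines ahead of use.
t0 = time.perf_counter()
seq_sum = sum(data[i] for i in range(len(data)))
t1 = time.perf_counter()

# Same arithmetic, randomized visit order: prefetching is defeated.
shuf_sum = sum(data[i] for i in order)
t2 = time.perf_counter()

# Identical result either way; only the access pattern differs.
assert seq_sum == shuf_sum
print(f"sequential: {t1 - t0:.3f}s  shuffled: {t2 - t1:.3f}s")
```

Both loops do the exact same arithmetic; any timing gap comes purely from how the memory is visited.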

The primary difference is that GPUs want large contiguous blocks of "threads" to do the same thing (because in reality they aren't actually independent threads).
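A toy model of that last point (my own illustration, not from the comment): SIMT hardware runs a "warp" of lanes in lockstep on a single instruction stream, and on a data-dependent branch both sides execute while a mask disables the lanes on the wrong side, so divergence pays for both paths.

```python
def warp_execute(values):
    """Simulate one warp executing a branch in lockstep: every lane
    steps through BOTH sides of the branch; a mask decides which lanes
    actually commit results on each pass."""
    mask_even = [v % 2 == 0 for v in values]
    results = list(values)
    # Pass 1: the whole warp walks the "even" path;
    # odd lanes are masked off and do no useful work.
    for i, active in enumerate(mask_even):
        if active:
            results[i] //= 2
    # Pass 2: the whole warp walks the "odd" path;
    # even lanes are masked off this time.
    for i, active in enumerate(mask_even):
        if not active:
            results[i] = 3 * results[i] + 1
    return results
```

So `warp_execute([4, 5, 6, 7])` gives `[2, 16, 3, 22]`, but the warp spent the time of both branch paths to get there, which is why GPUs want contiguous blocks of lanes all taking the same branch.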

zozbot234 8 hours ago | parent | prev [-]

If anything, GPUs combine large per-compute-unit private address spaces with a separate shared/global memory, which doesn't mesh very well with linear memory access, just high locality. You can kinda get to the same arrangement on a CPU by pushing NUMA (Non-Uniform Memory: only the "global" memory is truly Unified on a GPU!) to the extreme, but that's quite uncommon. "Compute-in-memory" is a related idea that points to the same constraint: you want to maximize spatial locality these days, because moving data in bulk is an expensive operation that burns power.

downrightmike 2 hours ago | parent | prev | next [-]

Mainframes still exist, so the CPU isn't going anywhere. Too useful a tool.
