stego-tech 2 hours ago

The Unified Memory pool is what will continue to be the “game changer” in systems architecture, especially outside of data centers.

The reality is that even cutting-edge games and consumer workloads don’t actually make full use of the GPU’s PCIe bandwidth or the bandwidth of its GDDR memory. Even local AI use cases don’t meaningfully benefit from faster memory, at least for average consumers.

A unified memory pool does two things:

1) Lets systems optimize utilization based on need, rather than be confined to specific pools

2) Reduces overall memory cost, by letting system builders purchase a single type of memory in bulk instead of having to figure out GDDR vs DDR memory placement (important for SFF/portable machines)

So at a time when memory is expensive, unified pools make more sense. Even when memory becomes cheap and plentiful again, it’s just practical at this point to allocate a larger overall pool instead of managing discrete sets.

The one big drawback is security. A shared memory pool means side-channel attacks against memory from the GPU or CPU could potentially compromise the other as well, meaning memory-safe designs are going to be critical to security going forward (which is good for Rust adherents, I figure).

seemaze 7 minutes ago | parent | next [-]

And here I am with 128GB Strix Halo longingly eyeing the Blackwell cards that spit tokens 10-20x the speed.

The question, I suppose, is the ultimate shape of knowledge compression and bandwidth optimization at which we arrive.

aabdi 4 minutes ago | parent | prev | next [-]

If this thing only has as much GPU bandwidth as the Spark, it’s kinda pointless.

Retr0id an hour ago | parent | prev | next [-]

Memory safety is orthogonal to side-channels, and hardware-enforced isolation (e.g. IOMMU) is more powerful than compiler-enforced isolation (but both are good!)

supertroop 13 minutes ago | parent | prev | next [-]

Intel was doing UMA with their i740 graphics in the late 90s. Codename TIMNA was cancelled, but they pioneered it and used it on their GPU/CPU chips as well as their breakthrough 810 chipset that dominated the graphics market for a decade. It was despised because it was ubiquitous and a low-performing graphics engine, but games had to accommodate it.

Funny that it is getting credit only now.

jmyeet an hour ago | parent | prev | next [-]

Unified memory is only a feature because NVidia so aggressively uses VRAM for market segmentation.

The 5090 ($2k MSRP but realistically $3-3.5k) is almost the same as the RTX 6000 Pro (~$10k). Same memory bandwidth (1800GB/s). Slightly different CUDA cores (21k vs 24k). Big difference? VRAM (32GB vs 96GB).

NVidia ultimately doesn't want to upset this segmentation so the RTX Spark will never undermine their other offerings. This is why I think Apple has a real market opportunity if they choose to embrace it.
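A rough way to see why VRAM capacity matters even when bandwidth is identical: a minimal sketch, assuming the common rule of thumb that single-stream decoding of a dense model is memory-bandwidth bound, so every weight is read once per generated token. The 1800 GB/s, 32 GB, and 96 GB figures come from the comment above; the model sizes (30 GB and 90 GB of weights) are hypothetical, and KV cache, activations, and compute limits are ignored.

```python
def fits_in_vram(weights_gb: float, vram_gb: float) -> bool:
    """True if the model weights alone fit in the card's VRAM."""
    return weights_gb <= vram_gb


def max_decode_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s if each token requires reading all weights once.

    This is a rule-of-thumb ceiling, not a benchmark result.
    """
    return bandwidth_gb_s / weights_gb


BANDWIDTH_GB_S = 1800.0  # figure quoted in the comment for both cards

# Hypothetical model sizes chosen only for illustration.
for weights_gb in (30.0, 90.0):
    ceiling = max_decode_tokens_per_s(BANDWIDTH_GB_S, weights_gb)
    print(
        f"{weights_gb:.0f} GB model: "
        f"fits in 32 GB={fits_in_vram(weights_gb, 32.0)}, "
        f"fits in 96 GB={fits_in_vram(weights_gb, 96.0)}, "
        f"ceiling ~{ceiling:.0f} tok/s"
    )
```

Under these assumptions both cards have the same ceiling for a model that fits on either; the larger card simply admits models the smaller one cannot hold at all.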

dahart 25 minutes ago | parent | next [-]

I have so many questions… Since Apple already sells unified memory systems, what is the market opportunity you envision? Do you see Nvidia and Apple as competitors, and how? (And I’m not suggesting they’re not, necessarily, but I want to hear where you’re coming from, and they do have very different markets.) Hasn’t Apple used storage size (RAM & disk) for market segmentation for decades? And how does a machine with 128GB unified mem not potentially cut into some people’s reasons for wanting a 96GB GPU?

zozbot234 an hour ago | parent | prev | next [-]

Even low-VRAM cards are actually very useful for running the comparatively smaller dense layers in large local MoE models. This only requires transferring very small amounts of data across the PCIe bus (similar to pipeline parallelism), so it fits nicely around the existing bottlenecks on that hardware.

woodson 37 minutes ago | parent | prev [-]

> 5090 ($2k MSRP but realistically $3-3.5k)

These days, more like >$4.1K (at least in the US).

Asmod4n 2 hours ago | parent | prev | next [-]

Yeah, you only see double-digit performance degradation going from PCIe 5 to 3 with a 5090 (at x16); with everything else it’s in the single digits.
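For scale, here is a minimal sketch of the theoretical one-direction x16 link bandwidth per PCIe generation. It uses the published per-lane signalling rates (8, 16, and 32 GT/s for Gen3, Gen4, and Gen5) and 128b/130b encoding; packet and protocol overhead are ignored, so real throughput is lower. The function name is just an illustration.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# Gen3 and later use 128b/130b line encoding: 128 payload bits per 130 sent.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbytes_per_s(gen: int, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in GB/s, ignoring protocol overhead."""
    bits_per_s_per_lane = GT_PER_S[gen] * ENCODING_EFFICIENCY  # Gbit/s per lane
    return bits_per_s_per_lane / 8 * lanes


for gen in (3, 4, 5):
    print(f"PCIe {gen}.0 x16: {pcie_gbytes_per_s(gen):.1f} GB/s")
```

That works out to roughly 16, 32, and 63 GB/s, so dropping from Gen5 to Gen3 cuts the link to a quarter of its bandwidth.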

stego-tech an hour ago | parent | next [-]

And the thing we gamers forget is that we’re the outlier. We’re the edge case.

Most consumers will never really care about, let alone see, the difference in PCIe or memory bandwidth impacts from such a shift to unified memory pools. We might (being, at least in my case, a huge nerd), but I’m increasingly of the opinion that if modern blockbuster games are built for upscaling/reconstruction anyhow, then suddenly such sacrifices to performance seem acceptable relative to the gains in efficiency.

jayd16 an hour ago | parent | prev | next [-]

Well I mean, the idea with games is it all fits in VRAM. You really don't want to be thrashing. It's that transfers are still so slow that they must be avoided entirely, no?

No-copy unified memory will help with that, but you do pay the read-speed costs.

BoredPositron an hour ago | parent | prev [-]

gen3 is 16 years old.

vlovich123 2 hours ago | parent | prev [-]

> (which is good for Rust adherents, I figure).

As a Rust adherent, please do not put words in our mouths or set up unrealistic expectations for other people by linking together concepts at a very shallow level.

Language-level memory safety has no answer for hardware security flaws, which is what side-channel attacks are. No programming language can provide memory privacy if another chip in your machine can read your memory, just like no programming language can protect your application from a vulnerability in the kernel it’s running on.

stego-tech an hour ago | parent | next [-]

Damn. That wasn’t my intention at all, I was just pointing out that Rust has another reason to see wider adoption vis a vis the usual Valley advertising bullshit of deliberately conflating hardware security with software security. I personally give no fucks what something is written in, only that it’s written well enough that I don’t have to twist arms or babysit yet another sloppy piece of code in my enterprise.

b112 an hour ago | parent | prev [-]

But... it's rust.