YasuoTanaka 6 hours ago

128GB of unified memory is a dream come true for local LLMs. VRAM has been the ultimate bottleneck for developers.

adrian_b 5 hours ago | parent | next [-]

The competitor for this NVIDIA CPU will not be the now-old AMD Strix Halo, but its recently launched successor, which supports up to 192 GB of unified memory. Thus 128 GB is no longer SOTA.

While this NVIDIA system is inferior in memory capacity, its main advantage is that the top models will have a bigger GPU, with 6144 or 5120 FP32 execution units, compared to 2560 for the AMD GPU. On the CPU side the comparison goes the other way: the AMD CPU has better multi-threaded performance for legacy programs, and much better multi-threaded performance for applications that use AVX-512.

However, these top models with big GPUs will also be much more expensive than the competing AMD system, while also being much more expensive than a laptop or mini-PC with an equivalent discrete NVIDIA GPU (which has the disadvantage of having direct access only to a much smaller, even if faster, memory).

christkv 5 hours ago | parent [-]

I don’t think there is much improvement in compute for the new Strix Halo revision. The next one supposedly adds RDNA 4 cores or similar, and more memory channels.

zamadatix 5 hours ago | parent | prev | next [-]

I have a 128 GB LPDDR5X machine. It's a great workstation laptop (which is why I got it), but the memory bandwidth is just awful if you want to use it for AI. An old EPYC CPU will fare better, both in being able to run full-sized larger models and in having higher memory bandwidth, and that's not a recommendation to go that route either, as it's still not worth it.
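To put rough numbers on why bandwidth matters more than capacity here, a back-of-the-envelope sketch: for a dense model, generating each token has to read roughly all the weights once, so decode speed is capped at about bandwidth divided by model size. The function name and the figures below are made up for illustration, not measurements of any machine in this thread, and the bound ignores KV-cache reads, compute limits, and MoE models that only touch part of the weights per token.

```python
def max_decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on decode speed for a dense model.

    Assumes every generated token streams all weights from memory once,
    so throughput cannot exceed bandwidth / weight size.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the trade-off.
    model_size_gb = 50.0  # e.g. a large model after quantization (assumed)
    for label, bandwidth in [("slower memory", 100.0), ("faster memory", 400.0)]:
        bound = max_decode_tokens_per_second(bandwidth, model_size_gb)
        print(f"{label}: at most {bound:.1f} tokens/s")
```

So being able to fit a bigger model in 128 GB doesn't help much if every token then takes that much longer to stream through memory.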

avocadoking 5 hours ago | parent | prev | next [-]

It could help with exploding external LLM costs. It will be interesting to see how adoption goes, which will mainly depend on the price.

SwtCyber 5 hours ago | parent | prev | next [-]

This is what makes it interesting to me as well

zackify 5 hours ago | parent | prev [-]

[dead]