vid 3 days ago

For the past 30-odd years I've hand-picked and built a desktop PC (which also acts as a home server) pretty much every year, selling the "old" one each time. I really enjoy it as a hobby and for the benefits of understanding and optimizing a system at the parts level. Even though there is a lot of nonsense created by all the choices and marketing, I really prefer the parts approach, and I'm happy with Linux, so a Mac isn't very appealing. A well-designed PC can handle tasks very well with optimized parts, and at a much better price.

But I just can't bring myself to upgrade this year. I dabble in local AI, where it's clear that fast memory is important, but the PC approach just isn't keeping up without going to "workstation" or "server" parts that cost too much.

There are glimmers of hope with MR-DIMMs, CUDIMMs, and other approaches, but boards and CPUs really need to support more memory channels. Intel has a small advantage over AMD, but it's nothing compared to the memory speed of a Mac Pro or higher. "Strix Halo" offers some hope with its quad-channel memory support, but it's meant for notebooks, so it isn't really expandable (which would enable à la carte hybrid AI: fast GPUs with reasonably fast shared system RAM).
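For a sense of scale, here's the back-of-envelope peak-bandwidth arithmetic. These are nominal spec-sheet numbers, and the Apple figure is approximate, so treat it as rough:

    # Peak memory bandwidth ~= transfers per second x bus width.
    # Nominal spec numbers, not measured throughput.
    def peak_gbps(mt_per_s: int, bus_bits: int) -> float:
        return mt_per_s * 1e6 * (bus_bits / 8) / 1e9

    print(peak_gbps(6000, 128))   # dual-channel DDR5-6000 desktop: 96 GB/s
    print(peak_gbps(8000, 256))   # Strix Halo 256-bit LPDDR5X-8000: 256 GB/s
    print(peak_gbps(6400, 1024))  # M2/M3 Ultra-class 1024-bit bus: ~819 GB/s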

I wish I could fast-forward to a better time, but it's likely fully integrated systems will dominate if their compact size, and the parts approach's relatively weak performance on some tasks, make the parts industry pointless. It is a glaring deficiency in the x86 parts concept and will result in PC parts becoming more and more niche, exotic, and inaccessible.

827a 3 days ago | parent | next

To be honest, much of the sense that Apple is ridiculously far ahead when it comes to unified-memory SoC architectures comes from people who aren't actually invested enough in any kind of non-Nvidia local AI development (either the AMD AI Max platform or Apple Silicon Ultra) to actually notice a difference. Because if you were, you'd realize that the grass isn't greener on these unified-memory platforms, and no one in the industry has a product that can compete with Nvidia on any vertical except "things for Jeff Geerling to make a video about".

vid 3 days ago | parent

People are running GPT-OSS-120B at 46 tokens per second on Strix Halo systems, which is quite usable and a fraction of the cost of a 128GB Nvidia or Apple system. Apple's GPU isn't that strong, so real competition to Apple and Nvidia is possible.
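That figure lines up with the bandwidth math, since decode speed on these systems is mostly bound by streaming the active weights for each token. A rough sketch (the parameter count and quantization size are approximate):

    # Decode ceiling for a memory-bound MoE model: every generated token
    # must stream the active expert weights from RAM.
    bandwidth_gbs   = 256      # Strix Halo peak (nominal)
    active_params   = 5.1e9    # GPT-OSS-120B activates ~5.1B params/token
    bytes_per_param = 0.5      # ~4-bit MXFP4 weights

    ceiling = bandwidth_gbs * 1e9 / (active_params * bytes_per_param)
    print(f"~{ceiling:.0f} tok/s theoretical ceiling")  # ~100 tok/s
    # The observed ~46 tok/s is roughly half that, which is plausible once
    # KV-cache traffic and kernel efficiency are accounted for.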

827a 3 days ago | parent

Exactly, yeah; my point is that there's a lot more to running these models than just the raw memory bandwidth and GPU-available memory size, and the difference between a $6000 M3 Ultra Mac Studio and a $2000 Ryzen AI Max+ 395 isn't actually as big as the raw numbers would suggest.

On the flip side, though: running GPT-OSS-120B locally is "cool", but have people found useful, productivity-enhancing use cases that justify doing this over just loading $2000 into your OpenAI API account? That, I'm less sure of.
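The break-even arithmetic is easy to sketch; the price per million tokens below is a hypothetical placeholder, not a real quote, so substitute whatever your actual API pricing is:

    # Hypothetical break-even: $2000 of local hardware vs $2000 of API credit.
    budget_usd     = 2000
    price_per_mtok = 5.0     # placeholder $/1M output tokens, NOT a real quote
    local_tok_s    = 46      # the Strix Halo figure quoted above

    api_tokens = budget_usd / price_per_mtok * 1e6
    days_equiv = api_tokens / local_tok_s / 86400
    print(f"~{api_tokens/1e6:.0f}M tokens of API credit")         # ~400M
    print(f"= ~{days_equiv:.0f} days of nonstop local decoding")  # ~101 days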

I think we'll get to the point where running a local-first AI stack is obviously an awesome choice; I just don't think the hardware or models are there yet. Next year's Medusa Halo, combined with another year of open-source model improvements, might be the inflection point.

vid 3 days ago | parent

I use local AI fairly often for innocuous queries (health, history, etc.) that I don't want to feed the spy machines, plus I like the hands-on aspect. I would use it more if I had more time, and while I hear the 120B is pretty good (I mostly use Qwen 30B), I would use it a lot more if I could run some of the really great models. Hopefully Medusa Halo will be all that.
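The hands-on part is pretty minimal these days; something like this, assuming an Ollama or llama.cpp server is already running locally (the model tag here is illustrative; use whatever you've pulled):

    # Query a local model through the OpenAI-compatible endpoint that
    # Ollama and llama.cpp's server both expose; nothing leaves the machine.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
    resp = client.chat.completions.create(
        model="qwen3:30b",  # illustrative tag, not a specific recommendation
        messages=[{"role": "user", "content": "Common causes of tension headaches?"}],
    )
    print(resp.choices[0].message.content)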

gsibble 3 days ago | parent | prev | next

I think most parts are geared towards gaming these days. When I've needed a server, I've gone for multi-CPU setups with older, cheaper CPUs.

That being said, for AI, HEDT is the obvious answer. Back in the day it was much more affordable, with my 9980XE costing only $2,000.

I just built a Threadripper 9980X system with 192GB of RAM, and good lord it was expensive. I will actually benefit from it, though, and the company paid for it.

Still, there is a glaring gap between "consumer" hardware meant for gaming and "workstation" hardware meant for real performance.

Have you looked into a 9960X Threadripper build? The CPU isn't TOO expensive, although the memory will be. But you'll get a significantly faster and better machine than something like a 9950X.

I also think that besides the new Threadripper chips, there isn't much new out this year anyway to warrant upgrading.

vid 3 days ago | parent

I have looked into Threadripper, but I just can't justify it. The tension between all the options and the cost, power usage, and size (EATX) is too much, and I don't think such a system, especially with 2025-era DDR5 in the 6000 MT/s range, will hold its value well. If I were directly earning money with it, sure, but as a hobby/augmentation to my work, I will wait out a generation or lose interest in the pursuit.

Competitors to Nvidia really need to figure things out. Even for gaming, with AI being used more, I think a high-end APU with fast shared memory would be compelling.

stillsut 3 days ago | parent | prev | next

At a meta-level, I wonder if there's an un-talked-about advantage to poaching ambitious talent out of an established incumbent to work on a new product line in a new organization; in this case, Apple Silicon disrupting Intel/AMD. We've also seen SpaceX do this to NASA/Boeing, and OpenAI do it to Google's ML departments.

It seems like large, unchallenged organizations like Intel (or NASA or Google) collect all the top talent out of school. But changing budgets, shifting business objectives, and frozen product strategies make it difficult for emerging talent to really work on next-generation technology (those projects have already been assigned to mid-career people who "paid their dues").

Then someone like Apple with the M-series chips, or SpaceX with the Falcon 9, comes along and poaches the people most likely to work "hardcore" (not optimizing for work/life balance), while also giving the new product a high degree of risk tolerance and autonomy. Within a few years, the smaller upstart organization has opened up an un-closeable performance gap with the behemoth incumbent.

Has anyone written about this pattern (beyond The Innovator's Dilemma)? Does anyone have other good examples of it?

vid 3 days ago | parent

I'm not sure it really takes that kind of breakthrough approach. Apple chips are more energy efficient, but x86 can be much faster on CPU or GPU tasks, and it's much more versatile. A main "bug and feature" issue is that the PC industry relies on common-denominator standards and components, whereas Apple has gone vertical with very limited core expansion. This is particularly important when it comes to memory speed, where standards are developed and factories upgraded over years at huge cost.

I gather it's very difficult and expensive to make a board that supports more channels of RAM, so that seems worth targeting at the platform level. Eight-channel RAM using common DIMMs would transform PCs for many tasks; however, for now gamers are the main force, and they don't really care about memory speed.
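The same peak-bandwidth arithmetic as above shows why: eight channels of ordinary DDR5-6000 would quadruple today's desktop figure and land well past Strix Halo (nominal numbers again):

    # Peak bandwidth by channel count, 64-bit DDR5 channels at 6000 MT/s.
    def peak_gbps(mt_per_s: int, channels: int, bits_per_channel: int = 64) -> float:
        return mt_per_s * 1e6 * channels * bits_per_channel / 8 / 1e9

    print(peak_gbps(6000, 2))  # today's dual-channel desktop: 96 GB/s
    print(peak_gbps(6000, 8))  # eight channels of the same DIMMs: 384 GB/s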

stillsut 3 days ago | parent

Makes sense: M-chips, Falcon 9, and GPTs are product subsets of the incumbents' traditional product capabilities.

natch 3 days ago | parent | prev

Apple and unified memory seem great, but losing CUDA seems like a big downside.

How do you sell your systems when their time comes?

vid 3 days ago | parent

I post them with good descriptions on local sites and Facebook Marketplace (sigh) and wait for the right buyer. Obviously for less than what I paid, but top-end parts can usually fetch a good price, I got a year of enjoyment out of it, and it's not going to a landfill.