whitehexagon 5 hours ago

Imagine the demand for a 128GB/256GB/512GB unified-memory Linux box shipping with Qwen models already up and running.

Although I'm against steps towards AGI, it feels safer to have these things running locally and disconnected from each other than in some giant gigawatt-scale agentic cloud data center connected to everyone and everything.

buyucu 5 hours ago | parent [-]

I bought a GMKtec EVO-X2, which is a 128 GB unified-memory system. Strongly recommend.

Keyframe 2 hours ago | parent | next [-]

That's the AMD Ryzen AI Max+ 395, right? Lots of those boxes have been popping up recently, but isn't it dog slow? And I can't believe I'm saying this, but maybe a RAM-filled Mac would be a better option?

ricardobeat 14 minutes ago | parent | next [-]

Yes, but the Mac costs 3-4x more. You can get one of these 395 systems with 96GB for ~1k.

Keyframe 9 minutes ago | parent [-]

When I was looking it was more like 1.6k euros, but still a great price. A Mac Studio with an M4 Max (16 CPU / 40 GPU / 16 Neural Engine cores) and 128GB is double that. That's all within the range of "affordable". Now, if it's at least twice the speed, I don't see a reason not to, even though my religion is against buying a Mac as well.

edit: just took a look at Amazon. The GMKtec EVO-X2 AI, which is the AMD Ryzen AI Max+ 395 with 128GB of RAM, is 3k euros. A Mac M4 Max with 16 cores and 128 gigs is 4.4k euros. Damn, Europe. If you go with the 14-core M4 Max, which still has a 16-core "Neural Engine"... ah, you can't get 128 GB of RAM then. Classic Apple :)

edit2: looked at the GMKtec site itself; the machine is 2k euros there. Damn, Amazon.

buyucu 39 minutes ago | parent | prev [-]

I'm not buying a Mac. Period.

te0006 2 hours ago | parent | prev [-]

Interesting - do you need to take any special measures to get OSS genAI models to work on this architecture? Can you use inference engines like Ollama and vLLM off-the-shelf (as Docker containers) there, with just the Radeon 8060S GPU? What token rates do you achieve?

(edit: corrected mistake w.r.t. the system's GPU)

buyucu 38 minutes ago | parent [-]

I just use llama.cpp. It worked out of the box.
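For anyone curious what "out of the box" looks like in practice, here is a minimal sketch using the llama-cpp-python bindings (an assumption on my part; the commenter may well use the llama.cpp CLI directly). The model filename and parameters are illustrative, not from the thread:

from llama_cpp import Llama

# Hypothetical example: loading a local Qwen GGUF file on a unified-memory box.
llm = Llama(
    model_path="./qwen-model-q4_k_m.gguf",  # assumed local GGUF file
    n_gpu_layers=-1,  # offload all layers to the GPU backend (e.g. Vulkan/ROCm)
    n_ctx=8192,       # context window; generous given 128 GB of unified memory
)

out = llm(
    "Q: Why is unified memory useful for local LLM inference? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(out["choices"][0]["text"])

With unified memory, the same pool serves CPU and GPU, so large quantized models fit without a discrete card's VRAM ceiling, which is the whole appeal of these 395 boxes.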