semi-extrinsic 3 days ago

For $3000 you can get three used Epyc servers with a total of 144 cores and 384 GB of memory, with dual-port 25GbE networking so you can run them as a fully connected cluster without a switch. It will have >20x better perf/$ and ~3x better perf/W.
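To illustrate why dual-port NICs are enough for a switchless three-node cluster, here is a minimal sketch of the full-mesh arithmetic: every node needs one port per peer, so two ports per node caps the mesh at three nodes and three cables. The node names and helper functions are hypothetical, chosen only for this example.

```python
from itertools import combinations


def full_mesh_links(nodes):
    """Return every point-to-point cable needed to connect all nodes directly."""
    return list(combinations(nodes, 2))


def max_switchless_nodes(ports_per_node):
    """Each node needs one port per peer, so ports + 1 nodes is the limit."""
    return ports_per_node + 1


# Hypothetical node names; the comment only says "3x used Epyc servers".
nodes = ["node-a", "node-b", "node-c"]
links = full_mesh_links(nodes)

for a, b in links:
    print(f"{a} <-> {b}")

print("cables needed:", len(links))
print("max nodes with dual-port NICs:", max_switchless_nodes(2))
```

A fourth node would need three ports per node (or a switch), which is why this trick stops at three machines.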

That combo gives you the better part of a gigabyte of L3 cache and an aggregate memory bandwidth of 600 GB/s, while staying below 1000 W total at full load. Plus your NICs are the fancy kind that let you play around with RoCEv2 and other nifty stuff.

It would then also be worthwhile to learn how to do things properly with SLURM, Warewulf, etc., instead of a poor man's solution with Ansible playbooks like in these blog posts.

p12tic 3 days ago

Better to build a single workstation: less noise, less power usage, and a far more convenient form factor. A budget of $3000 can buy 128 cores with 512 GB of RAM on a single regular EATX motherboard, plus a case, a power supply and other accessories. Power usage is ~550 W at maximum utilization, which is not much more than a gaming rig with a powerful GPU.
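For a rough side-by-side of the two $3000 builds quoted in this thread, a small sketch like the following works out cost, memory and power per core. The figures are the ones stated in the comments (the cluster's 1000 W is an upper bound, the workstation's 550 W is approximate); the `Build` class and its field names are made up for this example, and the numbers say nothing about per-core performance.

```python
from dataclasses import dataclass


@dataclass
class Build:
    name: str
    cost_usd: float
    cores: int
    memory_gb: int
    power_w: float  # full-load figure as quoted in the thread

    def usd_per_core(self) -> float:
        return self.cost_usd / self.cores

    def gb_per_core(self) -> float:
        return self.memory_gb / self.cores

    def watts_per_core(self) -> float:
        return self.power_w / self.cores


# Numbers taken from the two comments; 1000 W is "below 1000W", so an upper bound.
cluster = Build("3x used Epyc servers", 3000, 144, 384, 1000)
workstation = Build("single workstation", 3000, 128, 512, 550)

for build in (cluster, workstation):
    print(
        f"{build.name}: "
        f"${build.usd_per_core():.2f}/core, "
        f"{build.gb_per_core():.2f} GB/core, "
        f"{build.watts_per_core():.2f} W/core"
    )
```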

Almondsetat 3 days ago

You are taking my reply completely out of context. If you want to learn clustering, you need a lot of cores and RAM to run many VMs. You don't need them to be individually very powerful.