jacquesm 4 days ago

I've built a rig with 14 of them. NVLink is not 'an absolute must'; it can be useful depending on the model, the application software you use, and whether you're training or doing inference.
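
If you want to see whether the cards can actually talk to each other directly (over NVLink or plain PCIe peer-to-peer), a minimal check like this is enough. It just assumes a CUDA-enabled PyTorch install, nothing specific to my setup:

    import torch  # assumes a working CUDA-enabled PyTorch install

    n = torch.cuda.device_count()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # True means direct GPU-to-GPU copies (NVLink or PCIe P2P) are possible
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU {i} -> GPU {j}: peer access {'yes' if ok else 'no'}")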

The most important figure is the power consumed per token generated. You can optimize for that and get a reasonably efficient system, or you can maximize token generation speed and end up with twice the power consumption for very little gain. You will also likely need a way to get rid of the excess heat, and all those fans get loud. I stuck the system in my garage, which made the noise much more manageable.
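
To make that metric concrete, here's a rough sketch of the kind of measurement I mean. It assumes nvidia-smi is on the PATH; the sampling window and the token count are placeholders you'd wire into your own inference loop:

    import subprocess, time

    def gpu_power_watts():
        # Total instantaneous board power across all GPUs, in watts
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=power.draw",
             "--format=csv,noheader,nounits"], text=True)
        return sum(float(w) for w in out.split())

    samples, interval = [], 1.0
    start = time.time()
    while time.time() - start < 60:        # sample during a minute of generation
        samples.append(gpu_power_watts())
        time.sleep(interval)

    tokens_generated = 2400                # placeholder: read from your inference server
    joules = sum(samples) * interval       # approximate integral of power over time
    print(f"{joules / tokens_generated:.1f} J per token")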

breakds 4 days ago

I am curious about the setup of 14 GPUs - what kind of platform (motherboard) do you use to support so many PCIe lanes? And do you even have a chassis? Is it rack-mounted? Thanks!

jacquesm 4 days ago

I used a large Supermicro server chassis, a dual Xeon motherboard with seven 8-lane PCI Express slots, all the RAM it would take (bought second hand), splitters, and four massive power supplies. I extended the server chassis with aluminum angle riveted onto the base. It could be rack mounted, but I'd hate to be the person lifting it in. The 3090s were a mix: 10 of the same type (small, with blower-style fans) and 4 much larger ones that were kind of hard to accommodate (much wider and longer). I've linked to the splitter board manufacturer in another comment in this thread. That's the 'hard to get' component, but once you have those and good cables to go with them, the remaining setup problems are mostly power and heat management.
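
For anyone budgeting a similar build, a quick back-of-the-envelope sketch follows. The wattage numbers are assumptions for illustration, not measurements from my box:

    # All figures below are assumptions; power-limiting the cards
    # (nvidia-smi -pl) changes the picture considerably.
    CARDS = 14
    WATTS_PER_CARD = 250        # assumed per-3090 power cap
    HOST_WATTS = 400            # CPUs, RAM, fans, drives (assumption)
    PSU_EFFICIENCY = 0.92       # assumed roughly 80 Plus Platinum

    dc_load = CARDS * WATTS_PER_CARD + HOST_WATTS
    wall_draw = dc_load / PSU_EFFICIENCY
    print(f"DC load ~{dc_load} W, wall draw ~{wall_draw:.0f} W")
    # Nearly all of that ends up as heat, hence the garage.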

breakds 3 days ago

Thanks, that is very inspiring. I thought there were no blower-style consumer GPUs, but apparently they exist!

jacquesm 3 days ago

I got them second hand off some bitcoin mining guy.

https://www.tomshardware.com/news/asus-blower-rtx3090

That's the model I have.