fwipsy | 3 days ago
| Maybe because only AI enthusiasts want that much VRAM, and most of them will pony up for a higher-end GPU anyways? Everyone is suggesting it here because that's what they want, but I don't know if this crowd is really representative of broader market sentiment. |
vid | 3 days ago
There are a lot of local AI hobbyists; just visit r/LocalLLaMA to see how many people are using 8GB cards, or asking for higher-VRAM versions of existing cards. That makes the situation a bit mysterious: CUDA is clearly an advantage, but higher-VRAM, lower-cost cards with decent open library support would still be compelling.
rdos | 3 days ago
There is no point in using a low-bandwidth card like the B50 for AI. Attempting to use 2x or 4x of these cards to load a real model will result in poor performance and low generation speed. If you don't need a larger model, use a 3060 or 2x 3060, and you'll get significantly better performance than the B50, so much better that the higher power consumption won't matter (70W vs. 170W for a single card). Higher VRAM won't make the card 'better for AI'.
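A rough sketch of why bandwidth is the ceiling here: during single-user generation, each token has to stream roughly the whole set of model weights out of VRAM, so tokens/sec tops out near memory bandwidth divided by model size. The bandwidth figures and the 8 GB model size below are ballpark assumptions, not measurements:

```python
# Back-of-envelope decode speed: generating each token streams roughly all
# model weights from VRAM, so tokens/s is capped near bandwidth / model size.
# Bandwidth figures are approximate spec-sheet values (assumptions).

GB = 1e9

cards = {
    "Arc Pro B50 (16 GB)": 224 * GB,   # ~224 GB/s (assumed)
    "RTX 3060 (12 GB)":    360 * GB,   # ~360 GB/s
    "RTX 3090 (24 GB)":    936 * GB,   # ~936 GB/s
}

def decode_ceiling_tok_s(model_gb: float, bandwidth_bytes_s: float) -> float:
    """Upper bound on single-stream generation speed for a dense model."""
    return bandwidth_bytes_s / (model_gb * GB)

model_gb = 8.0  # e.g. a ~13B model at 4-5 bit quantization (assumption)

for name, bw in cards.items():
    print(f"{name}: ~{decode_ceiling_tok_s(model_gb, bw):.0f} tok/s ceiling")
```

And with the usual layer-split setup across multiple cards, each token still walks through every layer in sequence, so a second low-bandwidth card adds capacity but not much single-stream speed.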
bsder | 3 days ago
> There is no point in using a low-bandwidth card like the B50 for AI.
People actually use loaded-out M-series Macs for some forms of AI training. So total memory does seem to matter in certain cases.
robotnikman | 3 days ago
> 2x 3060
Are there any performance bottlenecks with using two cards instead of a single card? I don't think any of the consumer Nvidia cards use NVLink anymore, or at least they haven't for a while now.
vid | 3 days ago
Who said anything about the B50? Plenty of people use, e.g., 2, 4, or 6 3090s to run large models at acceptable speeds. Higher VRAM at decent speeds (much faster than DDR5) will make cards better for AI.
wqaatwt | 3 days ago
Nvidia has zero incentive to undercut its enterprise GPUs by adding more RAM to “cheap” consumer cards like the 5090, and Intel and even AMD can't compete or aren't bothering. I guess we'll see how the glued 48GB B60 will do, but that's still a relatively slow GPU regardless of memory. Might be quite competitive with Macs, though.
hadlock | 3 days ago
If VRAM is ~$10/GB, I suspect people paying $450 for a 12GB card would be happy to pay $1200 for a 64GB card. Running a local LLM only uses about 3-6% of my GPU's compute capability, but all of its VRAM. A local LLM has no need for six 3090s to serve a single user or a handful of users; it just needs the VRAM to hold the model locally.
vid | 3 days ago
Exactly. People would be thrilled with a $1200 64GB card with OK processing power and transfer speed; it's a bit of a mystery why it doesn't exist. Intel is enabling vendors to 'glue' two 24GB cards together into a 48GB card with a $1200 list price, but it's a Frankenstein's monster and will probably not be available for that price.
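To put the VRAM-is-the-constraint point in rough numbers: what has to fit on the card is basically the quantized weights plus the KV cache plus some runtime overhead, while the compute needed to serve one user is small by comparison. A minimal sizing sketch, where the parameter counts, layer/head figures, and quantization levels are illustrative assumptions rather than any particular model's specs:

```python
# Rough VRAM budget for a dense model: quantized weights + KV cache + overhead.
# All model-shape figures below are illustrative assumptions.

def est_vram_gb(params_b: float, bits_per_weight: float,
                ctx_tokens: int = 8192, n_layers: int = 48,
                kv_heads: int = 8, head_dim: int = 128,
                kv_bytes: int = 2, overhead_gb: float = 1.5) -> float:
    """Estimate VRAM (GB) needed to hold weights plus a KV cache."""
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: 2 tensors (K and V) per layer, kv_heads * head_dim values
    # per token, kv_bytes each (fp16), for ctx_tokens of context.
    kv_gb = 2 * n_layers * kv_heads * head_dim * kv_bytes * ctx_tokens / 1e9
    return weights_gb + kv_gb + overhead_gb

# Ballpark sizes: ~13B at 5-bit, ~32B at 5-bit, ~70B at 4-bit.
for params_b, bits in [(13, 5), (32, 5), (70, 4)]:
    print(f"~{params_b}B @ {bits}-bit: ~{est_vram_gb(params_b, bits):.0f} GB")
```

On numbers like these, a hypothetical 64GB card would comfortably hold a 4-bit ~70B model plus context, which is exactly what today's 16-24GB consumer cards can't do without splitting across GPUs or offloading to system RAM.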
|
|
fwipsy | 3 days ago
r/LocalLLaMA has 90,000 subscribers; r/PCMasterRace has 9,000,000. I'll bet there are a lot more casual PC gamers who don't talk about it online than there are casual local AI users, too.
|
|