rocqua 4 days ago

I believe that VRAM has massively shot up in price, so this is where a large part of the costs are. Besides I wouldn't be very surprised if Nvidia has such strong market share they can effectively tell suppliers to not let others sell high capacity cards. Especially because VRAM suppliers might worry about ramping up production too much and then being left with an oversupply situation.

kokada 4 days ago | parent | next [-]

This could well be the reason why the rumored RDNA5 will use LPDDR5X instead of GDDR7 memory, at least for the low/mid-range configurations (the top-spec and enthusiast configurations, AT0 and AT2, will apparently still use GDDR7).

FuriouslyAdrift 3 days ago | parent [-]

AFAIK, RDNA5 has been cancelled, as AMD is moving back to a unified architecture for its Instinct and Radeon lines.

kokada 2 days ago | parent [-]

It is not really clear whether it will be called UDNA or RDNA5. I was just referring to AMD's next-gen graphics architecture, and calling it RDNA5 makes it clearer that I mean the next-gen architecture.

GTP 4 days ago | parent | prev [-]

> Especially because VRAM suppliers might worry about ramping up production too much and then being left with an oversupply situation.

Given the high demand for graphics cards, is this a plausible scenario?

williamdclt 3 days ago | parent [-]

I don't really know what I'm talking about (whether graphics cards or AI inference), but if someone figures out how to significantly cut the compute needed for AI inference, I'd guess demand for graphics cards would suddenly drop?

Given how young and volatile this domain still is, it doesn't seem unreasonable to be wary of it. Big players (Google, OpenAI and the like) are probably pouring tons of money into trying to do exactly that.

rtrgrd 3 days ago | parent [-]

I would suspect that for self-hosted LLMs, quality >>> performance, so newer releases will always expand to fill the capacity of available hardware even when efficiency improves.