|
| ▲ | fwipsy 3 days ago | parent | next [-] |
| Maybe because only AI enthusiasts want that much VRAM, and most of them will pony up for a higher-end GPU anyway? Everyone is suggesting it here because that's what they want, but I don't know if this crowd is really representative of broader market sentiment. |
| |
| ▲ | vid 3 days ago | parent | next [-] | | There are a lot of local AI hobbyists; just visit /r/LocalLLaMA to see how many are using 8GB cards, or how many people are asking for higher-RAM versions of cards. That's what makes it mysterious: CUDA is clearly an advantage, but higher-VRAM, lower-cost cards with decent open library support would still be compelling. | | |
| ▲ | rdos 3 days ago | parent | next [-] | | There is no point in using a low-bandwidth card like the B50 for AI. Attempting to stack 2x or 4x of them to load a real model will result in poor performance and low generation speed. If you don't need a larger model, use a 3060 or 2x 3060, and you'll get so much better performance than the B50 that the higher power consumption won't matter (70W vs. 170W per card). Higher VRAM alone won't make the card 'better for AI'. | | |
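[A rough sanity check of the bandwidth argument above: single-user decode is roughly memory-bandwidth bound, so tokens/s is about bandwidth divided by the bytes of weights read per token. A minimal sketch, assuming approximate spec-sheet bandwidths of ~224 GB/s for the B50 and ~360 GB/s for the 3060, and an ~8 GB (4-bit) model:]

    # Rough upper bound on single-user decode speed for a memory-bandwidth-bound LLM.
    # Bandwidth figures are approximate spec-sheet numbers, not measurements.
    def tokens_per_sec(bandwidth_gb_s, model_size_gb):
        # Each generated token streams roughly every weight once.
        return bandwidth_gb_s / model_size_gb

    model_gb = 8.0  # e.g. a ~14B model quantized to 4 bits
    for name, bw in [("Arc Pro B50 (~224 GB/s)", 224), ("RTX 3060 (~360 GB/s)", 360)]:
        print(f"{name}: ~{tokens_per_sec(bw, model_gb):.0f} tok/s upper bound")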
| ▲ | bsder 3 days ago | parent | next [-] | | > There is no point in using a low-bandwidth card like the B50 for AI. People actually use loaded-out M-series Macs for some forms of AI training. So total memory does seem to matter in certain cases. | |
| ▲ | robotnikman 3 days ago | parent | prev | next [-] | | >2x 3060 Are there any performance bottlenecks with using two cards instead of a single card? I don't think any of the consumer Nvidia cards use NVLink anymore, or at least they haven't for a while now. | |
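[For context on that question: when layers are split across two cards pipeline-style (as llama.cpp-style offloading does), the inter-GPU traffic per generated token is tiny compared with the weight reads each card makes from its own VRAM, so the missing NVLink mostly hurts training and tensor-parallel serving. A back-of-the-envelope sketch with an illustrative hidden size:]

    # Estimate per-token PCIe traffic for a pipeline-style layer split across two GPUs.
    hidden_size = 5120          # illustrative; depends on the model
    bytes_per_activation = 2    # fp16 activations
    boundary_crossings = 1      # one handoff between the two cards per generated token

    pcie_bytes = hidden_size * bytes_per_activation * boundary_crossings
    print(f"~{pcie_bytes / 1024:.0f} KiB over PCIe per token")  # vs. GBs of weights read from VRAM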
| ▲ | vid 3 days ago | parent | prev [-] | | Who said anything about the B50? Plenty of people use e.g. 2, 4, or 6 3090s to run large models at acceptable speeds. Higher VRAM at decent speeds (much faster than DDR5) will make cards better for AI. | | |
| ▲ | wqaatwt 3 days ago | parent | next [-] | | Nvidia has zero incentive to undercut its enterprise GPUs by adding more RAM to "cheap" consumer cards like the 5090. Intel and even AMD can't compete or aren't bothering. I guess we'll see how the glued 48GB B60 does, but that's still a relatively slow GPU regardless of memory. Might be quite competitive with Macs, though. | |
| ▲ | hadlock 3 days ago | parent | prev [-] | | If VRAM is ~$10/GB, I suspect people paying $450 for a 12GB card would be happy to pay $1200 for a 64GB card. Running a local LLM only uses about 3-6% of my GPU's compute, but all of its VRAM. A local LLM doesn't need six 3090s' worth of compute to serve one or a handful of users; it just needs the VRAM to hold the model. | | |
| ▲ | vid 3 days ago | parent [-] | | Exactly. People would be thrilled with a $1200 64GB card with OK processing power and transfer speed. It's a bit of a mystery why it doesn't exist. Intel is enabling vendors to 'glue' two 24GB cards together into a 48GB card with a $1200 list price, but it's a Frankenstein's monster and will probably not be available at that price. |
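[Rough arithmetic behind those numbers, as a sketch only: the ~$10/GB figure is the parent commenter's estimate, and the model footprints assume 4-bit quantization plus a few GB of KV-cache overhead:]

    # What a hypothetical 64 GB card buys, using the parent's ~$10/GB estimate.
    vram_gb, price_per_gb = 64, 10
    print(f"Memory BOM: ~${vram_gb * price_per_gb}")  # ~$640 of the $1200 ask

    def fits(params_billions, bits=4, kv_overhead_gb=4):
        weights_gb = params_billions * bits / 8  # 4-bit => ~0.5 GB per billion params
        return weights_gb + kv_overhead_gb <= vram_gb

    for p in (14, 32, 70, 180):
        print(f"{p}B @ 4-bit: {'fits' if fits(p) else 'does not fit'} in {vram_gb} GB")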
|
|
| |
| ▲ | fwipsy 3 days ago | parent | prev [-] | | r/LocalLLaMA has 90,000 subscribers. r/PCMasterRace has 9,000,000. I'll bet there are a lot more casual PC gamers who don't talk about it online than there are casual local AI users, too. | | |
| |
|
|
| ▲ | blkhawk 4 days ago | parent | prev | next [-] |
| because the cards already sell at very, very good prices with 16GB, and optimizations in generative AI are bringing down memory requirements. Optimizing profits means you sell with the least amount of VRAM possible, not only to save the direct cost of the RAM but also to guard future profit and your other market segments; the cost of the RAM itself is almost nothing compared to that. Any Intel competitor could more easily release products with more than 16GB and smoke them. Intel is going after a market segment that until now was only served by gaming cards twice as expensive, which frees those up to finally be sold at MSRP. |
| |
| ▲ | betimsl 3 days ago | parent [-] | | Right, but Intel is in no position to play that game. So they could gamble and put 32GB of VRAM on it, and also, why not produce one with 64GB just for kicks? | | |
| ▲ | Workaccount2 3 days ago | parent [-] | | If Intel were serious about staging a comeback, they would release a 64GB card. But Intel is still lost in its hubris, still thinking it's a serious player and "one of the boys", so it doesn't seem like they want to break ranks. |
|
|
|
| ▲ | bsder 3 days ago | parent | prev | next [-] |
| > If it is that easy to compete with Nvidia, why don't we already have those cards? Businesswise? Because Intel management are morons, and because AMD, like Nvidia, doesn't want to cannibalize its high end. Technically? "Double the RAM" is the most straightforward way to differentiate (that doesn't necessarily make it easy ...), since training sets you couldn't run yesterday because they wouldn't fit on the card can be run today. It also takes a direct shot at how Nvidia does market segmentation with RAM sizes. Note that "double the RAM" is necessary but not sufficient. You need to get people to port all the software to your cards to make them useful, and to do that, you need something compelling about the card. These Intel cards have nothing compelling about them. Intel could make these cards compelling by cutting the price in half, or by dropping two dozen of them on every single AI department in the US for free; suddenly, every grad student in AI would know everything about your cards. The problem is that Intel institutionally sees zero value in software and is incapable of making the moves it needs to compete in this market. Since software isn't worth anything to Intel, there is no way to justify any business action that isn't just "sell (kinda shitty) chips". |
|
| ▲ | rocqua 4 days ago | parent | prev | next [-] |
| I believe that VRAM has shot up massively in price, so that is where a large part of the cost is. Besides, I wouldn't be very surprised if Nvidia has such a strong market position that they can effectively tell suppliers not to let others sell high-capacity cards. Especially because VRAM suppliers might worry about ramping up production too much and then being left in an oversupply situation. |
| |
| ▲ | kokada 4 days ago | parent | next [-] | | This could well be the reason why the rumored RDNA5 will use LPDDR5X instead of GDDR7 memory, at least for the low/mid-range configurations (the top-spec and enthusiast AT0 and AT2 configurations will still use GDDR7, it seems). | | |
| ▲ | FuriouslyAdrift 3 days ago | parent [-] | | AFAIK, RDNA5 has been cancelled as AMD is moving back to a unified architecture with their Instinct and Radeon lines. | | |
| ▲ | kokada 2 days ago | parent [-] | | It's not really clear whether it will be called UDNA or RDNA5; I was just referring to AMD's next-gen graphics architecture, and calling it RDNA5 makes it clearer that I mean the next generation. |
|
| |
| ▲ | GTP 4 days ago | parent | prev [-] | | > Especially because VRAM suppliers might worry about ramping up production too much and then being left with an oversupply situation. Given the high demand for graphics cards, is this a plausible scenario? | | |
| ▲ | williamdclt 3 days ago | parent [-] | | I don't really know what I'm talking about (whether about graphics cards or AI inference), but if someone figures out how to cut the compute needed for AI inference significantly, I'd guess the demand for graphics cards would suddenly drop? Given how young and volatile this domain still is, it doesn't seem unreasonable to be wary of it. Big players (Google, OpenAI and the like) are probably pouring tons of money into trying to do exactly that. | | |
| ▲ | rtrgrd 3 days ago | parent [-] | | I would suspect that for self-hosted LLMs, quality >>> performance, so newer releases will always expand to fill the capacity of available hardware even as efficiency improves. |
|
|
|
|
| ▲ | robotnikman 3 days ago | parent | prev | next [-] |
| There does seem to be a grey market for it in China. On AliExpress and eBay you can buy cards where the memory modules have been swapped for higher-capacity ones. |
|
| ▲ | danielEM 3 days ago | parent | prev | next [-] |
| The Ryzen AI Max+ 395 with 128GB can do 256 GB/s, so let's put all these "ifs" to bed once and for all. It's an absolute no-brainer to add more RAM as long as the physical hardware has enough bits in its address space. And it usually does, since the same silicon is branded and packaged differently for the commercial and consumer markets. Look at how the Chinese are doubling the 4090's RAM from 24 to 48GB. |
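[For reference, that 256 GB/s figure follows straight from the memory configuration; a minimal sketch, assuming the commonly reported 256-bit LPDDR5X-8000 setup for this chip:]

    # Peak bandwidth from bus width and transfer rate.
    # Assumes the commonly cited 256-bit LPDDR5X-8000 configuration.
    bus_width_bits = 256
    transfers_per_sec = 8.0e9  # 8000 MT/s
    bytes_per_sec = (bus_width_bits / 8) * transfers_per_sec
    print(f"{bytes_per_sec / 1e9:.0f} GB/s peak")  # -> 256 GB/s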
|
| ▲ | PunchyHamster 3 days ago | parent | prev | next [-] |
| Because if they do, their higher-end models will sell worse. |
|
| ▲ | qudat 3 days ago | parent | prev | next [-] |
| I'm willing to bet there are technical limitations to just "adding more VRAM" to all these boards. |
|
| ▲ | agilob 4 days ago | parent | prev | next [-] |
| AMD is gradually getting into doing that; they have a few decent cards with 20GB and 24GB now. |
|
| ▲ | izacus 3 days ago | parent | prev [-] |
| Because: (1) fewer people care about VRAM than HN commenters give the impression of, and (2) VRAM is expensive and wouldn't make such cards profitable at the price points HN wants. |