| ▲ | Loic an hour ago |
| I think the OpenAI deal to lock up wafers was a wonderful coup. OpenAI is increasingly losing ground to the regularity[0] of the improvements coming from Anthropic, Google and even the open-weights models. By creating a choke point at the hardware level, OpenAI can keep the competition from increasing their reach for lack of hardware. [0]: For me this is a really important part of working with Claude: the model improves over time but stays consistent. Its "personality", or whatever you want to call it, has been very stable across the past versions, which makes the transition from version N to N+1 very smooth. |
|
| ▲ | Grosvenor an hour ago | parent | next [-] |
| Could this generate pressure to produce less memory-hungry models? |
| |
| ▲ | hodgehog11 an hour ago | parent | next [-] | | There has always been pressure to do so, but there are fundamental bottlenecks in performance when it comes to model size. The shift I can imagine is a push toward training exclusively for search-based rewards, so that the model isn't required to compress a large proportion of the internet into its weights. But that is likely to be much slower and to come with initial performance costs that frontier model developers will not want to incur. | | |
| ▲ | thisrobot 9 minutes ago | parent | next [-] | | I wonder if this maintains the natural language capabilities, which are what make LLMs magical to me. There is probably some middle ground, but not having to know which expressions or idioms an LLM will understand is really powerful from a user experience point of view. | |
| ▲ | Grosvenor 36 minutes ago | parent | prev | next [-] | | Yeah, that was my unspoken assumption. The pressure here results in an entirely different approach or model architecture. If OpenAI is spending $500B, then someone can get ahead by spending $1B on an approach that improves the model by more than 0.2%. I bet there's a group or three that could improve results by a lot more than 0.2% with $1B. | |
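As a quick sanity check of the break-even arithmetic in the comment above, here is a minimal Python sketch; the $500B and $1B figures are the ones quoted in the thread, and the assumption that model quality scales roughly in proportion to spend is only the comment's implicit premise, not an established fact:

```python
# Break-even arithmetic for the "$1B challenger" argument above.
# Assumption (the comment's implicit premise): quality scales roughly
# with spend, so a smaller effort only needs to beat the fraction of
# total spend it represents.

big_spend = 500e9    # incumbent spend cited in the thread ($500B)
small_spend = 1e9    # challenger budget ($1B)

break_even_gain = small_spend / big_spend
print(f"Break-even improvement: {break_even_gain:.1%}")  # 0.2%
```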
| ▲ | UncleOxidant 36 minutes ago | parent | prev | next [-] | | Or maybe models that are much more task-focused? Like models that are trained on just math & coding? | |
| ▲ | parineum 31 minutes ago | parent | prev [-] | | > so that the model isn't required to compress a large proportion of the internet into its weights. The knowledge compressed into an LLM is a byproduct of training, not a goal. Training on internet data is what teaches the model to talk at all. The knowledge and the ability to speak are intertwined. |
| |
| ▲ | lofaszvanitt 41 minutes ago | parent | prev [-] | | Of course, and then watch those companies get reined in. |
|
|
| ▲ | hodgehog11 an hour ago | parent | prev | next [-] |
| I don't see this working for Google though, since they make their own custom hardware in the form of the TPUs. Unless those designs include components that are also susceptible? |
| |
| ▲ | jandrese 15 minutes ago | parent | next [-] | | That was why OpenAI went after the wafers, not the finished products. By buying up the supply of the raw materials, they bottleneck everybody, even in unrelated fields. It's the kind of move that requires a true asshole to pull off, knowing it will give your company an advantage but screw up life for literally billions of people at the same time. | |
| ▲ | frankchn an hour ago | parent | prev | next [-] | | TPUs use HBM, which is impacted. | |
| ▲ | bri3d an hour ago | parent | prev | next [-] | | Still susceptible: TPUs need DRAM dies just as much as anything else that processes data. I think they use some form of HBM, so they basically have to compete alongside the DDR supply chain. | |
| ▲ | UncleOxidant 39 minutes ago | parent | prev [-] | | Even their TPU based systems need RAM. |
|
|
| ▲ | codybontecou an hour ago | parent | prev | next [-] |
| This became very clear from the outrage, rather than excitement, when users were forced to upgrade from 4o to GPT-5. |
|
| ▲ | lysace 37 minutes ago | parent | prev | next [-] |
| Please explain to me like I am five: Why does OpenAI need so much RAM? 2024 production was (according to openai/chatgpt) 120 billion gigabytes. With 8 billion humans that's about 15 GB per person. |
| |
| ▲ | mebassett 11 minutes ago | parent [-] | | Large language models are large and must be loaded into memory to train, or to use for inference if we want to keep them fast. Older models like GPT-3 have around 175 billion parameters; at float32 precision that comes out to something like 700 GB of memory. Newer models are even larger, and OpenAI wants to run them as consumer web services. | | |
| ▲ | lysace 8 minutes ago | parent [-] | | I mean, I know that much. The numbers still don't make sense to me. How is my internal model this wrong? For one, if this was about inference, wouldn't the bottleneck be the GPU computation part? |
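For reference, here is the back-of-the-envelope math from this exchange as a small Python sketch; the 175B parameter count and the 120 billion GB production figure are the values quoted in the comments, and the float16 line is an illustrative lower-precision variant, not a claim about what OpenAI actually deploys:

```python
# Back-of-the-envelope memory math from the exchange above.
# Figures are the ones quoted in the thread; anything below float32 is
# an illustrative assumption, not OpenAI's actual serving setup.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in GB."""
    return n_params * bytes_per_param / 1e9

params = 175e9  # GPT-3-class parameter count cited above

print(f"float32 weights: {weight_memory_gb(params, 4):,.0f} GB")  # ~700 GB
print(f"float16 weights: {weight_memory_gb(params, 2):,.0f} GB")  # ~350 GB

# Each serving replica needs its own copy of the weights (plus a KV-cache
# per concurrent request), so serving many users at low latency means many
# copies of the weights spread across many HBM-equipped accelerators.

# The per-person figure from the question upthread:
dram_2024_gb = 120e9   # ~120 billion GB of DRAM produced in 2024 (quoted)
people = 8e9
print(f"DRAM per person: {dram_2024_gb / people:.0f} GB")  # ~15 GB
```

This doesn't settle the compute-versus-memory question raised in the reply; it only shows why the weight footprint alone runs to hundreds of GB per replica.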
|
|
|
| ▲ | hnuser123456 an hour ago | parent | prev [-] |
| Sure, but if the price is being inflated by inflated demand, then the suppliers will just build more factories until they hit a new, higher optimal production level. Prices will come back down, and eventually process improvements will lead to price-per-GB resuming its overall downtrend. |
| |
| ▲ | malfist an hour ago | parent | next [-] | | Micron has said they're not scaling up production. Presumably they're afraid of being left holding the bag when the bubble does pop. | | |
| ▲ | fullstop 13 minutes ago | parent | next [-] | | Why are they building a foundry in Idaho? https://www.micron.com/us-expansion/id | | |
| ▲ | delfinom 3 minutes ago | parent [-] | | Future demand aka DDR6. The 2027 timeline for the fab is when DDR6 is due to hit market. |
| |
| ▲ | Analemma_ an hour ago | parent | prev [-] | | Not just Micron: SK Hynix has made similar statements (unfortunately I can only find sources in Korean). DRAM manufacturers got burned multiple times in the past by scaling up production during a price bubble, and it appears they've learned their lesson (to the detriment of the rest of us). |
| |
| ▲ | mholm an hour ago | parent | prev | next [-] | | Chip factories need years of lead time, and manufacturers might be hesitant to take on new debt in a massive bubble that might pop before they ever see any returns. | |
| ▲ | nutjob2 an hour ago | parent | prev [-] | | Memory fabs take billions of dollars and years to build, and the memory business is a tough one where losses are common, so no such relief is in sight. With a bit of luck OpenAI collapses under its own weight sooner rather than later; otherwise we're screwed for several years. |
|