| ▲ | helloplanets a day ago | |
OpenAI also announced two days ago that they're starting to make Cerebras-style chips themselves [0]. It will be interesting to see how fast SotA model inference is by the end of the year. [0]: https://openai.com/index/openai-broadcom-jalapeno-inference-... | ||
| ▲ | mlyle a day ago | parent | next [-] | |
I don't understand how you refer to this as "Cerebras-style". Cerebras is wafer-scale and unique. Jalapeno is an inference-optimized conventional chip. | ||
| ▲ | WarmWash a day ago | parent | prev | next [-] | |
Cerebras is different from Jalapeno. Jalapeno is for mass-scale inference. Cerebras is extremely expensive and difficult to scale, hence the limited release. | ||
| ▲ | paxys a day ago | parent | prev | next [-] | |
Even if their chip is a difference maker, end of the year is way too optimistic. It'll be a multi-year effort at minimum to bring it to production at scale. | ||
| ▲ | jauntywundrkind a day ago | parent | prev [-] | |
I don't see any indications that OpenAI is doing wafer-scale work. I tend to doubt they would. Cerebras notably doesn't have a KV cache and is wildly high bandwidth, but only within/across the chip; it isn't able to dump/restore KV very well. I doubt OpenAI is going to build something that is as expensive to run. Also, wafer-scale is absurdly hard & weird to pull off, so I doubt that would be their first foray. | ||