alephnerd 4 days ago
From what I'm hearing in my network, the name of the game is custom chips hyperoptimized for your own workloads. A major reason DeepSeek was so successful margin-wise was that the team deeply understood Nvidia, CUDA, and Linux internals. If you understand the intricacies of your custom ASIC's architecture, it's much easier to solve perf issues, parallelize, and debug problems. And then you can make up the cost by selling inference as a service.

> Amazon and I think Microsoft are also working on their own NVIDIA replacement chips

Not just them. I know of at least 4-5 other similar initiatives (some public, like OpenAI's; another being contracted by a large nation; and a couple others that haven't been announced yet, so I can't divulge). Contract ASIC and GPU design is booming, and Broadcom, Marvell, HPE, Nvidia, and others are cashing in on it.
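To make the "understanding the hardware" point concrete, here is a minimal sketch of the kind of optimization I mean. It is my own illustration, not anything from DeepSeek's codebase, and the TILE size and matrix shape are assumptions: a tiled CUDA matmul that stages data in shared memory and uses coalesced loads, i.e. the sort of kernel you only write well if you know the memory hierarchy of the chip you're targeting.

```cuda
// Hedged illustration: a tiled matrix multiply that exploits knowledge of the
// GPU memory hierarchy. TILE = 32 is an assumption (32x32 floats = 4 KiB per
// staging buffer, comfortably within per-block shared memory on NVIDIA parts).
#include <cuda_runtime.h>
#include <cstdio>

#define TILE 32

// C = A * B for square N x N row-major matrices.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int N) {
    // Shared-memory staging buffers: each tile is loaded from global memory
    // once per phase and reused TILE times, cutting global traffic roughly by
    // a factor of TILE versus a naive one-element-per-thread kernel.
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (N + TILE - 1) / TILE; ++t) {
        // Coalesced loads: consecutive threads in a warp read consecutive
        // addresses, the access pattern the memory controller rewards.
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        As[threadIdx.y][threadIdx.x] = (row < N && aCol < N) ? A[row * N + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < N && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();

        #pragma unroll
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }

    if (row < N && col < N)
        C[row * N + col] = acc;
}

int main() {
    const int N = 1024;
    size_t bytes = (size_t)N * N * sizeof(float);
    float *A, *B, *C;
    cudaMallocManaged(&A, bytes);
    cudaMallocManaged(&B, bytes);
    cudaMallocManaged(&C, bytes);
    for (int i = 0; i < N * N; ++i) { A[i] = 1.0f; B[i] = 2.0f; }

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (N + TILE - 1) / TILE);
    matmul_tiled<<<grid, block>>>(A, B, C, N);
    cudaDeviceSynchronize();

    printf("C[0] = %f (expected %f)\n", C[0], 2.0f * N);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```

The same reasoning transfers to a custom ASIC: the tile size, staging strategy, and load pattern all fall out of knowing the part's on-chip memory sizes and bandwidth characteristics, which is exactly the knowledge advantage being described above.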
coredog64 4 days ago
I wouldn't be surprised if a fair portion of Amazon's Bedrock traffic is being served by Inferentia silicon. Their margins on Anthropic models are razor thin and there's a lot of traffic, so there's definitely an incentive. Additionally, every model served on Inferentia frees up Nvidia capacity, either for models that can't be served that way or for selling to customers.