originalvichy 8 hours ago
As others have noted, the article is analysing the financial-markets angle. For my two cents on the technical side, any Western-origin shakiness will likely come from Apple and how it manages to land the Gemini deal and Apple Intelligence v2. There is an astounding amount of edge-inference capacity sitting in people's phones and laptops that Apple Intelligence has only slightly cracked open.

Data centre buildouts will get corrected when the numbers come in from Apple: how large a share of the average consumer's tokens can be fulfilled with lightweight models and Google searches of the open internet. That will serve as a guiding principle for any future buildout and for the heavyweight inference cards Nvidia is supplying. The 2-5 year moat that top providers have with the largest models will get chomped at by the leisure/hobby/educational use cases that lightweight models handle capably. Small language and visual models are already amazing (a toy sketch of this kind of on-device inference follows below).

The next crack will appear when past-gen cards (if they survive around-the-clock operation) get bought up by second-hand operators who can provide capable inference of even current-gen models. If past knowledge of DC operators holds (e.g. Google and its aging TPUs that still get use), the providers with the resources to buy new space for newer gens will accumulate the bulk of the hardware, while everyone else will need to continuously shave off the financial hit that comes with running less efficient older cards. I'm excited to see future blogs about hardware geeks buying used inference stacks and repurposing them for home use :)
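For illustration only, here is a minimal sketch of the kind of lightweight, on-device inference described above, using llama-cpp-python on an ordinary laptop CPU. The model file path and prompt are hypothetical placeholders; any small quantized GGUF instruct model (say, 1-3B parameters) would do:

```python
# A toy sketch of lightweight, on-device inference: a small quantized
# model running entirely on a laptop, no data-centre round trip.
from llama_cpp import Llama

# Load a small quantized GGUF model from local disk.
# The file path is a hypothetical placeholder.
llm = Llama(
    model_path="./models/small-instruct-q4.gguf",
    n_ctx=2048,      # modest context window keeps memory use low
    verbose=False,
)

# A leisure/hobby-class prompt that a lightweight model handles capably.
out = llm(
    "Summarise in one sentence: why do leaves change colour in autumn?",
    max_tokens=64,
    stop=["\n\n"],
)
print(out["choices"][0]["text"].strip())
```

The point of the sketch is that nothing leaves the device: for hobby and educational prompts like this, heavyweight hosted inference is not needed at all.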
notatoad 8 hours ago | parent
> when the numbers come in from Apple: how large a share of the average consumer's tokens can be fulfilled with lightweight models and Google searches of the open internet

Is there any reason to expect that this information will ever be known outside of Apple?
| ||||||||