m-hodges 7 hours ago

As frontier models get closer and closer to running on consumer hardware, what's the moat for the API-driven $trillion labs?

stri8ted 7 hours ago

48 GB is not consumer hardware. But fundamentally, there are economies of scale from batching, power distribution, better utilization, and so on that make data-center tokens cheaper. Also, as the cost of training frontier models increases, it's not clear that Chinese companies will continue open-sourcing them. Notice, for example, that Qwen-Max is not open source.
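The batching advantage can be made concrete with a back-of-the-envelope sketch (toy numbers and function names of my own, not from the thread): for one transformer matmul, the weights must be streamed from memory once per forward pass, and a batch of B requests shares that single read, so FLOPs performed per byte of weight traffic grow linearly with B.

```python
def arithmetic_intensity(d_in: int, d_out: int, batch: int,
                         bytes_per_weight: int = 2) -> float:
    """FLOPs performed per byte of weight traffic for one d_in x d_out matmul."""
    flops = 2 * d_in * d_out * batch                 # one multiply-accumulate pair per weight, per batch element
    weight_bytes = d_in * d_out * bytes_per_weight   # weights are read once and shared by the whole batch
    return flops / weight_bytes

# A single-user laptop effectively runs at batch 1; a data center can
# merge many concurrent requests into one large batch.
solo = arithmetic_intensity(4096, 4096, batch=1)    # 1.0 FLOP/byte: memory-bound
dc   = arithmetic_intensity(4096, 4096, batch=64)   # 64.0 FLOP/byte: much better hardware utilization
```

With fp16 weights the ratio works out to exactly the batch size, which is why the same GPU serves tokens far more cheaply when it can batch.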

zozbot234 7 hours ago

Nothing obviously prevents using this approach for, e.g., 3B-active or 10B-active models, which do run on consumer hardware. I'd love to see how the 3B performs with this on the MacBook Neo, for example. More relevantly, data-center tokens are only cheaper for the specific type of tokens data centers sell. If you're willing to wait long enough for your inferences (and your overall volume is low enough that you can afford to), you can use approaches like OP's (offloading read-only data to storage) to handle inference on low-performing, slow "edge" devices.
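A minimal sketch of the "offload read-only data to storage" idea (file names and helpers are mine, not OP's code): keep the weights in a file and memory-map them read-only, so the OS pages in only the slices each forward pass actually touches instead of holding the whole model in RAM.

```python
import numpy as np

def make_weight_file(path: str, d: int) -> None:
    """One-time setup: write a d x d float32 weight matrix to disk."""
    w = np.lib.format.open_memmap(path, mode="w+", dtype=np.float32, shape=(d, d))
    w[:] = 0.01   # toy constant weights for the sketch
    w.flush()

def slow_forward(path: str, x: np.ndarray) -> np.ndarray:
    """One matmul with weights left on storage; RAM holds only the pages in use."""
    w = np.load(path, mmap_mode="r")   # read-only memory map, nothing copied up front
    return x @ w                       # pages of w are faulted in from disk on demand

make_weight_file("/tmp/w.npy", 256)
y = slow_forward("/tmp/w.npy", np.ones(256, dtype=np.float32))
```

It's slow (every token re-streams weights from storage) but it trades latency for memory footprint, which is exactly the deal a low-volume edge user can accept.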

m-hodges 4 hours ago

> 48 GB is not consumer hardware.

It’s a MacBook.

WesolyKubeczek an hour ago

It is consumer hardware in the sense that MacBook Pros come with this RAM size as a base configuration and that you can buy one as a consumer, without having to sign a special B2B contract, prove your company is big and reputable enough, or order a minimum of 10 or 100 units.

OJFord 7 hours ago

Assuming you mean 'moat': they'll keep pushing the frontier forward; they don't really have to worry until progress levels off.

At that point, I suppose there are still paid harnesses (people have always paid for IDEs despite FOSS options), partly for mindshare, and they could use their expertise and compute capacity to provide application-specific training for enterprises that need it.

BoredomIsFun 6 hours ago

> the API-driven $trillion labs?

here we go: https://huggingface.co/collections/trillionlabs/tri-series