reactordev 3 days ago

Not to mention developer hardware. As AI eats more and more of the world, more and more people will become developers, and they'll need machines capable of at least running quantized models. It's not as good as having your very own A100 or H100, but the M4 Max is above everything else on the desktop save an RTX 5090 paired with a beefy Ryzen.

It's also an opportunity to disrupt: build hardware specifically for AI tasks and reduce it down to just an ASIC.
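
And "running quantized models" locally is already only a few lines of code. A minimal sketch with llama-cpp-python, assuming the library is installed and a quantized GGUF model has been downloaded (the path and prompt are placeholders):

    # Minimal local-inference sketch; model path is a placeholder.
    from llama_cpp import Llama

    # n_gpu_layers=-1 offloads every layer to the GPU / Apple Silicon
    # unified memory; omit it to run CPU-only.
    llm = Llama(model_path="models/model-q4_k_m.gguf", n_gpu_layers=-1)

    out = llm("Explain tail recursion in one paragraph.", max_tokens=128)
    print(out["choices"][0]["text"])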

kcb 3 days ago | parent

The vast majority of those developers, by an order of magnitude, will be using AI models running on a server somewhere.

reactordev 3 days ago | parent

Maybe, maybe not. I don't want to live in a world where I can't develop software without the support of some massive third-party corporation. I think local model inference is going to pick up. Will it ever compete with the clusters? No, but it's good enough for a solo developer to get work done.

I could be wrong, and we could be watching the ladders get pulled up right now and the moats being filled with alligators and sharks, until we have no choice but to pick between providers. I hope we can keep the hacking/tinkering culture alive if we can no longer run things ourselves.

kcb 3 days ago | parent

There's no real physical reason to run these LLMs on the end device. LLM interaction is not particularly latency sensitive; it doesn't need the single-digit-millisecond response times gaming does. So if there's no real usability drawback to hopping over the Internet, that's the direction it will go.
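
Rough numbers make the point. Everything below is an illustrative assumption, not a benchmark:

    # Back-of-envelope latency budget; all numbers are assumptions.
    network_rtt_s = 0.05        # ~50 ms round trip to a nearby region
    response_tokens = 500       # a typical multi-paragraph answer
    server_tok_per_s = 80.0     # datacenter GPU serving throughput
    local_tok_per_s = 25.0      # quantized model on consumer hardware

    remote_s = network_rtt_s + response_tokens / server_tok_per_s
    local_s = response_tokens / local_tok_per_s
    print(f"remote: {remote_s:.1f}s, local: {local_s:.1f}s")
    # remote: ~6.3s, local: ~20.0s -- the one-time network hop is
    # noise next to generation time either way.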