I think DeepSeek may be important to that. They have a really good open source model, which raises the bar for all other players: to make meaningful money on your model, it needs to be better than DeepSeek.

The same thing happened in other areas where the open source offering became popular.

mchusma 18 hours ago | parent | next [-]

I think the original DeepSeek moment was important. And yes, the more recent model is good, but there are multiple good ones. This commoditization trend spans models from many different companies, including Kimi 2.5/2.6 and GLM5.1, and even Google itself with its Gemma models. There are a dozen models at roughly the frontier of 6 months ago, at 1/10th the cost.

	mistrial9 17 hours ago | parent [-]
		> that exist at roughly the frontier
		No, I disagree; specifics matter. There are a dozen well-defined LLM application subject areas that are regularly tested. One overall grade IMO lacks important detail. To go a bit abstract, it is ironic that "oversimplification" in the discussion of these complex machines mirrors the effects of the automations themselves on information: constantly simplifying, substituting and diluting real meaning.

dist-epoch 20 hours ago | parent | prev [-]

What good is an open-weights DeepSeek model if you have nowhere to run it?

OpenAI / Google / Anthropic / xAI also have a ton of compute. That is the real moat.

eli 19 hours ago | parent | next [-]

It's quite expensive to self-host, but you have many places to run it. OpenRouter alone lists a dozen different providers for DeepSeek V4 Pro: https://openrouter.ai/deepseek/deepseek-v4-pro/providers

So long as there is demand, there are always going to be providers competing to offer it at a low cost. My understanding is that the median price on there is in the ballpark of what it costs to run the inference. This is very different from e.g. Opus, which you can basically only buy from Anthropic at the price they set.

nmfisher 20 hours ago | parent | prev | next [-]

antirez is running (quantized) DeepSeek V4 Pro on a Mac Studio M3 Ultra with 512GB of RAM:

https://bsky.app/profile/antirez.bsky.social/post/3mlzwmvlov...

It's much closer than you think. We're going to see specialized hardware in the next 24 months capable of running 2025-era frontier models. That's big.

menaerus 33 minutes ago | parent | next [-]

2-bit quantization? That's a lot of signal being removed. Considering how quickly AI models are progressing in their capabilities (still on an exponential curve), I will not want to use a 2025 model in two years' time, just as I don't want to use llama-3 or an old Anthropic model from 2023 or 2024 today. Newer models are so much better that they are very difficult to ignore.

Only if and when the advancements in AI models slow down will it, IMHO, become feasible to design specialized HW for general-purpose consumption and general-purpose workloads.

treis 18 hours ago | parent | prev | next [-]

It's big because it may take a big swath of the people who will actually pay for LLMs out of the market. But the average consumer is going to primarily use their phone/tablet, and we're far away from running these models there.

Even if it were possible, LLMs are such a gold mine of user data that it's really hard to see that opportunity being passed up.

dist-epoch 19 hours ago | parent | prev [-]

That specialized hardware will be scooped up by AI data-centers, just like RAM is today.

	nine_k 19 hours ago | parent | next [-]
		No more than Mac Studios. Datacenters need different hardware.
	ffsm8 19 hours ago | parent | prev [-]
		The 512 GB RAM Studio can't even be purchased anymore. It's been delisted (https://www.apple.com/shop/buy-mac/mac-studio). Same with the Mac mini. Entirely removed from all store references.

wolttam 19 hours ago | parent | prev | next [-]

I just got into self-hosting DeepSeek V4 Flash on a single DGX Spark via antirez’s DwarfStar 4 project.

It feels great to finally have access to something local.

amanaplanacanal 20 hours ago | parent | prev [-]

That seems pretty temporary if people can just build more compute.