We can run open weight models on our own machines.

Yes, but a model that runs on my own machine will never have the capacity of a model that runs in a datacenter. As I said, it can't compete with that.

randbyte 2 hours ago

> a model that runs on my own machine will never have the capacity of a model that runs in a datacenter.

I don’t think so. A locally run model only needs to serve one or a few people. It seems possible to run a DeepSeek v4 model at full capacity on a server costing 200k USD. Very expensive, but not impossible.

Factor in hardware and software improvements over time, plus the fact that most people may only need a smaller, quantized model, and a PC in the 10k USD range should eventually be enough.

thewebguyd 4 hours ago

If RAM prices ever come down, you can have a machine that can run a capable local model.

Qwen 2.5 72B is surprisingly capable, almost on par with GPT-4o if not a little better. You can run it on a 128GB Mac Studio with 8-bit quantization. You need about 77GB for the weights and ~15GB for your context window & cache.
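For anyone who wants to check that arithmetic, here is a rough back-of-the-envelope sketch. The numbers in it are assumptions, not measurements: about 72.7B parameters for the model, roughly 8.5 bits per weight for a block-wise 8-bit format (8-bit values plus a per-block scale), and the ~15GB context/cache figure taken from above. The helper names are made up for illustration.

```python
# Rough memory estimate for running a quantized model locally.
# All constants below are assumptions for illustration, not measured values.

def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def total_gb(params_billion: float, bits_per_weight: float, cache_gb: float) -> float:
    """Weights plus a fixed budget for the context window and KV cache."""
    return weights_gb(params_billion, bits_per_weight) + cache_gb


PARAMS_B = 72.7   # assumed parameter count, in billions
BITS_Q8 = 8.5     # assumed effective bits per weight for block-wise 8-bit
CACHE_GB = 15.0   # context window + cache budget quoted in the comment
RAM_GB = 128.0    # machine memory

w = weights_gb(PARAMS_B, BITS_Q8)
t = total_gb(PARAMS_B, BITS_Q8, CACHE_GB)

print(f"weights: {w:.1f} GB")
print(f"weights + cache: {t:.1f} GB")
print(f"fits in {RAM_GB:.0f} GB: {t <= RAM_GB}")
```

Under those assumptions the weights land around 77GB and the total stays well under 128GB, though the OS and other apps still need their share of that memory.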

Pricing remains to be seen, but there are also those new Nvidia laptops coming out. The Surface Laptop Ultra should have 128GB RAM with a Blackwell GPU, and they're saying 1 petaflop of AI compute, if you can tolerate Windows (no idea if it'll boot Linux until the hardware is out).

These models are roughly a year or less behind the frontier models. We really just need hardware to catch up and alleviate the price pressure on RAM.

rustcleaner an hour ago

> If RAM prices ever come down

Maybe an unpopular opinion here (seeing how Y Combinator is his baby), but I think OpenAI and Sam Altman should be financially decimated for cornering the DRAM market. What he's done is a step or two removed from what the Hunt brothers did. His buy-up of future DRAM silicon has measurably harmed personal computing, and he should not get to walk away with a 'win' from it.