pmontra 2 hours ago

Agreed, but I want to see how it plays out. Historically a good Windows computer cost $1000 and it was all it took to start programming. How much does a computer cost that has enough resources to run a good enough AI model for agentic workflows with a reasonable time to first token? Can "most of the world" afford to buy one?

wizee 2 hours ago | parent | next [-]

Qwen 3.6 27B is quite good for agentic coding, and practical to run on consumer hardware. You need a system with either 32+ GB of VRAM, or a unified memory system with 48+ GB of memory and a decent integrated GPU. While not cheap, such a setup is still attainable for much of the world, and will get cheaper over time. Open models hosted on non-American clouds also remain an option with a much lower barrier to entry, for cases where privacy is less critical.
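
As a rough sanity check on those memory figures, here is a minimal back-of-the-envelope sketch. It assumes weight memory is simply parameter count times bits per weight, and it adds a flat `overhead_gb` allowance for KV cache and runtime buffers. That 4 GB allowance and the helper names are hypothetical, chosen for illustration; real usage depends on context length and the inference engine.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 10^9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, overhead_gb: float = 4.0) -> bool:
    """True if weights plus an assumed flat overhead fit in memory_gb.

    overhead_gb is a placeholder for KV cache and buffers, not a measured value.
    """
    needed = weight_memory_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= memory_gb


if __name__ == "__main__":
    # 27B parameters at common precisions: 16-bit, 8-bit, and 4-bit quantization.
    for bits in (16, 8, 4):
        gb = weight_memory_gb(27, bits)
        print(f"{bits:>2}-bit: {gb:.1f} GB weights, fits in 32 GB: {fits(27, bits, 32)}")
```

Under these assumptions a 27B model needs quantization to 8 bits or fewer to fit in 32 GB, which is consistent with the 32+ GB guidance above.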

jochem9 2 hours ago | parent | next [-]

There was an article on HN a few weeks ago where someone detailed how they managed to get an old datacenter GPU to run in their consumer PC, getting decent performance with qwen. He spent something like $200 on the GPU (second hand of course).

So yeah, I think models on local hardware will be quite common soon among the tech savvy (such as people creating software).

wrs an hour ago | parent [-]

Especially considering the millions of 2026-class data center GPUs that massively overinvested companies are currently buying, which will be obsolete in a few years.

treis an hour ago | parent | next [-]

I think those are going to be run until they die. The capex vs opex is too high to obsolete them in a few years. They'll keep serving current gen LLMs for as long as they keep running.

Chu4eeno 27 minutes ago | parent [-]

They can also be used for things other than running the main frontier model.

E.g. grok isn't truly multi-modal, it has a callable tool that is a separate VLM it invokes on image URLs or files (for a long time it was grok-1.5v, but I think they have upgraded now, it was pretty bad).

And then you have the small summarizer models for the CoT/thought traces, the guidable summarizer models for the standard browse tools, etc.

There's a ton of stuff that can use an aging GPU.

nok22kon 13 minutes ago | parent | prev [-]

H100s were released in Oct 2022. They are now more expensive than at release time.

schmuhblaster an hour ago | parent | prev | next [-]

Indeed, and with some tinkering around the harness it can even punch way above its weight.

thewebguyd an hour ago | parent | prev [-]

> You need a system with either 32+ GB VRAM

I do hope you're right that it will get cheaper over time (it should), but right now 32GB of VRAM is not affordable to a lot of people. You're talking ~$4500 just for the GPU, or $800 ish used if you can find one.

daan-k an hour ago | parent [-]

For inference you can split the 32GB between two 16GB cards. Two new 5060tis for ~€1000 in total is more than fine.

It's a tad less efficient and a bit more of a hassle, but still a good experience for only a fraction of the price.

majormajor 2 hours ago | parent | prev | next [-]

> Historically a good Windows computer cost $1000 and it was all it took to start programming.

Gotta remember inflation here.

$1K in 1995 was roughly equivalent to $2K now and wouldn't have been a particularly "good" machine then.

In 1982 the Commodore 64 started at about $600, also roughly $2K today.

If you outgrew that, beefier machines back then were A LOT. It was easy to find $2k+ towers and (especially) laptops even into the 2000s, and a lot of those would be $5K+ equivalent today.
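
The inflation arithmetic above can be checked with a short sketch. The CPI values below are approximate US CPI-U annual averages quoted from memory and should be treated as rounded assumptions, and 2024 stands in for "now"; the `adjust` helper is hypothetical.

```python
# Approximate US CPI-U annual averages (1982-84 = 100).
# Rounded assumptions for illustration, not authoritative figures.
CPI = {1982: 96.5, 1995: 152.4, 2024: 313.7}


def adjust(amount: float, from_year: int, to_year: int = 2024) -> float:
    """Scale a dollar amount by the ratio of CPI in to_year to CPI in from_year."""
    return amount * CPI[to_year] / CPI[from_year]


if __name__ == "__main__":
    print(f"$1000 in 1995 -> ${adjust(1000, 1995):,.0f} in 2024 dollars")
    print(f"$600 in 1982  -> ${adjust(600, 1982):,.0f} in 2024 dollars")
```

With these inputs both figures land close to $2K, in line with the estimates above.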

SoftTalker an hour ago | parent [-]

And a unix workstation in those days could be high 4 or even 5 figures, depending on configuration.

Chu4eeno 2 hours ago | parent | prev | next [-]

Open weights/source doesn't necessarily mean running on local hardware, though.

I imagine having multiple providers competing will drive down hosted versions of open weight models drastically.

abetusk an hour ago | parent | prev | next [-]

Moore's law or one of its generalizations still holds, so it will only be a matter of time before a $1k computer will be able to train and run a powerful enough model.

Windchaser an hour ago | parent [-]

I thought Moore's Law came to an end in the last decade?

Certainly the transistors/chip or transistors/$ or flops/$ have not been progressing at the same exponential rate as during 1970-2010. There is still progress, but it's rather slower.
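
How much the rate matters can be made concrete with a small sketch. It assumes the price of a fixed amount of compute halves every `halving_years`, and asks how long it takes a machine to fall from one price to another. The $4000 starting price and the halving periods are hypothetical inputs, not measurements.

```python
import math


def years_to_price(current_usd: float, target_usd: float, halving_years: float) -> float:
    """Years until price falls from current_usd to target_usd,
    assuming it halves every halving_years (a simplifying assumption)."""
    halvings_needed = math.log2(current_usd / target_usd)
    return halvings_needed * halving_years


if __name__ == "__main__":
    # Compare a classic ~2-year halving period with slower hypothetical rates.
    for period in (2, 4, 6):
        years = years_to_price(4000, 1000, period)
        print(f"halving every {period} years: {years:.0f} years to go from $4000 to $1000")
```

Going from $4000 to $1000 takes two halvings, so the wait scales directly with the halving period: a slower rate of progress stretches "a matter of time" proportionally.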

ssivark an hour ago | parent | prev | next [-]

I don't understand the justification for local hardware with cost as the motivation. The same (or bigger/better) open weights models can be served by third parties at much higher resource utilisation, and will therefore be much cheaper!?

Especially because the world is likely to persist, at least for a while, in a state where computing hardware demand drastically exceeds supply, resulting in high prices for hardware. So why wouldn't you want to max out utilisation and amortize costs, at least for typical (non-sensitive) use cases?
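
The utilisation argument can be sketched as a simple amortization. This is a minimal model that counts hardware cost only and ignores electricity, batching gains, and provider margins; every input (the $2000 price, 3-year lifetime, 30 tokens/s, and both utilisation levels) is a hypothetical figure for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def cost_per_million_tokens(hardware_usd: float, lifetime_years: float,
                            tokens_per_second: float, utilization: float) -> float:
    """Hardware cost amortized over all tokens generated during its lifetime.

    utilization is the fraction of wall-clock time spent generating tokens.
    """
    total_tokens = tokens_per_second * lifetime_years * SECONDS_PER_YEAR * utilization
    return hardware_usd / total_tokens * 1_000_000


if __name__ == "__main__":
    # Same hypothetical machine, used occasionally by one person vs. kept busy.
    personal = cost_per_million_tokens(2000, 3, 30, utilization=0.05)
    shared = cost_per_million_tokens(2000, 3, 30, utilization=0.60)
    print(f"5% utilisation:  ${personal:.2f} per million tokens")
    print(f"60% utilisation: ${shared:.2f} per million tokens")
```

In this model cost per token is inversely proportional to utilisation, so identical hardware kept twelve times busier is twelve times cheaper per token, before counting any throughput gains from batching many users.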

layer8 32 minutes ago | parent [-]

IMO the more useful distinction is in analogy to VPS versus SaaS/PaaS. Open models allow you to use any inference provider you like, including local ones, similar to running open-source software using VPS providers. You’re not bound to a particular SaaS/PaaS as you are with closed model providers. That same freedom also allows you to self-host when you care about that.

bensyverson 2 hours ago | parent | prev | next [-]

Yes, between Moore's Law and more efficient model architectures, we just have to let time do its work.

Danox 2 hours ago | parent [-]

Software models and hardware are getting better all the time, and that’s where some big companies spending billions might stumble! In fact, Microsoft recently announced that they’re scaling back a bit on their AI investments.

giancarlostoro 2 hours ago | parent | prev | next [-]

Before the AI "crisis" it used to take about $3500 to get a prebuilt with a 5090 which can run good enough LLMs. I run reasonable LLMs on just 16GB of VRAM on my Mac, and the 5090 has double that.

epolanski 7 minutes ago | parent | prev | next [-]

If we had 256-512 GB of unified memory at 2022 prices, we'd be talking about $1500 computers.

crazycracker an hour ago | parent | prev | next [-]

Historically the cost of compute has also gone down. Just compare it with a year ago. We have amazing open source models that can run on consumer hardware, and if we get away from our obsession with using opus 4.8 or mythos for everything, it is actually amazing to see what these open source models can do. I use qwen3.6:27b as a daily driver and I am heavily impressed with it.

Kim_Bruning 2 hours ago | parent | prev | next [-]

Roughly about Eur 3-4K right this minute I think? The graphics card, ram and storage are punishing. Under more normal circumstances (hopefully late 2027) it'd be 1500-2500 depending on what you think is realistically useful.

Possibly it's the same price range, allowing for inflation.

rayiner 2 hours ago | parent | prev | next [-]

Isn’t this just a bet that I’ll have an AI data center in my iPhone within 10 years? Why is that a bad bet?

mbgerring 2 hours ago | parent | prev | next [-]

About $2k in 2026 dollars and falling.

simonw 2 hours ago | parent [-]

... or rising, at least as long as there's a RAM shortage.

mbgerring 2 hours ago | parent [-]

I’d bet that there won’t be a RAM shortage for very long.

simonw 2 hours ago | parent | next [-]

The best article I've seen about that is this one by David Oks (ignore the headline, the content is much better): https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone

> It was only in 2025, as memory prices began an unprecedented surge, that the memory makers started to build new fabs targeted at HBM, all slated to start producing chips in 2027 or 2028.

fellowmartian an hour ago | parent [-]

It still won’t help unless the AI bubble pops. Even old fabs will continue pumping out HBM instead of DRAM as long as hyperscalers gobble it up.

Avicebron 2 hours ago | parent | prev [-]

This seems wildly optimistic, do you have anything to support it?

swiftcoder an hour ago | parent | next [-]

The RAM shortage is predicated on both the huge datacenter buildouts (many of which are already mired in delays, with a few even cancelled outright), and the massive memory purchase commitments various hyperscalers have made - hyperscalers who seem to be running short on cash lately...

AnimalMuppet an hour ago | parent | prev [-]

History? This isn't the first RAM shortage. When one happens, producers build more fabs. The fabs come online, the availability of memory shoots up, and the shortage goes away, usually replaced by a glut.

If you want to argue that this is different from all previous RAM shortages, you can, but the burden of proof is on you to show the difference.

nok22kon 9 minutes ago | parent [-]

there is a glut if demand stops.

this time demand doesn't stop. there is an exponential demand for tokens.

skydhash 28 minutes ago | parent | prev | next [-]

> Historically a good Windows computer cost $1000 and it was all it took to start programming

I started with computers around 2009 and later bought an oldish computer (a Pentium 4 PC) for the equivalent of 50 USD. Code::Blocks and Python IDLE were free at the time (C and Python were the first languages I learned). The barrier to programming has always been low, as the only things you needed were books (the internet made things easier) and access to a PC (I had friends with laptops and my school's lab).

ktallett 2 hours ago | parent | prev [-]

Hence why brute force needs to be replaced with approaches such as neuromorphic methods. It could realistically be combined with mesh networking as well to utilise the capabilities of all computers locally.