| ▲ | hparadiz 6 hours ago |
| Currently the barrier to entry for local models is about $2500. Funny thing is, $2500 is about the amount my parents paid for a 166 MHz machine in 1995. |
|
| ▲ | thijson 4 hours ago | parent | next [-] |
| I remember my Dad buying a 386 25MHz a few years earlier for a similar amount. In 1984 he bought a TRS-80 for almost a thousand dollars: 32kB RAM, around 1 MHz 8-bit CPU. I bought a Pentium 90 in the late '90s for several thousand dollars. It had the FDIV bug in it. After experiencing a lifetime of high depreciation in electronics, I'm extremely price sensitive when buying them. I feel that if I wait a few years everything will become much cheaper. Maybe that's not the case with the slowdown in Moore's law and the AI datacenter build-out. |
|
| ▲ | Aurornis 2 hours ago | parent | prev | next [-] |
| The top local model in this benchmark is Qwen3.5-9B (Q4_K_M), which is not a big model. 9B = 9 billion parameters. Q4_K_M is the quantization, which comes in at around 4.5 bits per weight. It will run well on a $500 Mac Mini. |
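For a rough sense of why that fits so easily, here's a back-of-the-envelope sketch. (Q4_K_M's exact bits-per-weight varies by tensor; 4.5 is an approximation, and this ignores KV cache and runtime overhead.)

```python
# Back-of-the-envelope VRAM estimate for quantized model weights.
# Q4_K_M mixes 4- and 6-bit blocks, landing near 4.5 bits/weight on average.
def weight_memory_gb(params_billions: float, bits_per_weight: float = 4.5) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB

print(round(weight_memory_gb(9), 1))  # ~5.1 GB for a 9B model at Q4_K_M
```

So the weights alone come to roughly 5GB, which is why a 16GB machine has headroom left over for context.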
| |
| ▲ | hparadiz 42 minutes ago | parent [-] | | I'm actually running it on my AMD 6900 XT right now with 16GB of VRAM, but I'm looking at my options for upgrading my local model. Can't say I'm a fan of these entry level machines to be honest. I wanna be able to run it with 100k context. |
|
|
| ▲ | brandall10 5 hours ago | parent | prev | next [-] |
| My first 'real' machine was a Price Club (now Costco) 386sx for $3800 in late '89, which would be nearly $10k adjusted for inflation. 16 MHz, 1 MB RAM, 40 MB hard disk. That was bargain basement for that era. IBMs, Compaqs and the like were ~$5k similarly configured, and the first 486s were in the $7-9k area. |
| |
▲ | hparadiz 5 hours ago | parent [-] | | This picture of the Ryzen AI Max+ blew my mind. https://images.prismic.io/frameworkmarketplace/Z7aVJZ7c43Q3f... Look, this isn't an ad. I've been building my own desktops since I was 14. It's always been a CPU, motherboard, and memory as separate parts type of deal, but this thing has it all integrated. Look how small it is. I use Gentoo. I compile all the things. I know exactly how long it takes to compile gcc because I do it all the time. This thing compiles the Linux kernel in 62 seconds. And it uses less power than my current machine to do it. I am jealous. The computer age is not slowing down. It's in fact speeding up. Am I the only one excited as fuck about what's coming? You don't even need a GPU because it handles gaming tasks like it's nothing. | | |
▲ | bakies 3 hours ago | parent [-] | | I bought one of those 6 months ago when the top spec was $2k; now it's $2700, yikes. Very happy with my purchase. I picked this precisely because it's the only non-Apple option with that unified memory architecture. I still wanted to put Kubernetes on it, so it's important that it's not a Mac. |
|
|
|
| ▲ | 5 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | aegis_camera 5 hours ago | parent | prev | next [-] |
| Entry level is actually a Mac Mini 16GB at <$499. I have models running on an M2 Mini 16GB; it works with small models. |
| |
▲ | bigyabai 5 hours ago | parent [-] | | If "small models" is the bar, then you can run inference for ~$50 on Raspberry Pi-like hardware. I do that with 1.8B-4B models. | | |
▲ | aegis_camera 5 hours ago | parent [-] | | LFM 450M for vision tasks, Qwen 9B Q4 for orchestration; this provides good results. | | |
| ▲ | hparadiz 5 hours ago | parent [-] | | I actually meant a context window of about 50k which is what you need to run OpenClaw well. |
|
|
|
|
| ▲ | segmondy 6 hours ago | parent | prev | next [-] |
| This is very false. My first system was a 3060 which you can buy new for about $300 or used for about $200. If you already have an existing system you can use it, else you can pick up a used PC for about $150. Entry is about $500. |
| |
▲ | johndough 5 hours ago | parent [-] | | Perhaps OP was referring to a usable agentic system, for which $2500 sounds about right. I've got a 3060 myself, which is nice for playing around with the smaller models for free (minus electricity) and with 100% uptime, but I was not able to program anything with them yet that I didn't want to rewrite completely. A heavily quantized Qwen3.5-27B model is getting close though. Maybe in a few months. | | |
▲ | hparadiz 5 hours ago | parent | next [-] | | I was actually thinking of the AMD Ryzen AI Max+ 395, which compiles the Linux kernel in 62 seconds and is the first usable integrated graphics solution I've seen. Benchmarks: https://old.reddit.com/r/LocalLLaMA/comments/1rpw17y/ryzen_a... | | |
▲ | 0xbadcafebee 5 hours ago | parent | next [-] | | What does usable mean? There have been laptops and desktops with AI-capable iGPUs and 96-128GB RAM for 2 years. | |
▲ | aegis_camera 5 hours ago | parent | prev [-] | | This is a good platform. I was thinking about getting one. |
| |
| ▲ | 0xbadcafebee 5 hours ago | parent | prev | next [-] | | Strix Halo systems were ~$1500. They've gone up in price due to demand, but that is a perfectly usable "agentic system" (whatever that means). If 128GB VRAM and a fast GPU isn't good enough, I don't know what is. | | |
| ▲ | johndough 4 hours ago | parent [-] | | > Strix Halo systems were ~$1500. They've gone up in price due to demand The price hike has been crazy. The Bosgame M5 Mini is $2400 now. I didn't get one last year when they were $1500 because I thought the memory bandwidth was mediocre. However, it doesn't look like we'll get anything better for that price anytime soon. |
| |
▲ | aegis_camera 5 hours ago | parent | prev [-] | | I also got the 4070 laptop version during a heavy discount season, before the 50 series came out, and upgraded to 96GB DDR5 when it was cheap. So I like LFM 450M + Qwen 9B Q4; they are a good fit for 8GB VRAM. |
|
|
|
| ▲ | BoredPositron 5 hours ago | parent | prev [-] |
| The model being used is 9B; even with a big context you can easily run it on 16GB. You don't need a $2500 machine for it. |
| |
| ▲ | hparadiz 5 hours ago | parent [-] | | For coding and personal assistance the context window on 16GB is not good enough. Ideally I want a context window of 100k. | | |
| ▲ | Aurornis 2 hours ago | parent | next [-] | | This is starting to feel like a conversation where the goal posts keep moving, but a Mac Mini with 32GB of RAM starts at $999 | |
▲ | BoredPositron 5 hours ago | parent | prev [-] | | In the other reply you said 50k. 16GB VRAM provides 40-70k on the 9B depending on the implementation and quant, which is more than enough for the tool we are discussing in this thread, but it looks like you are just changing your story instead of admitting that your initial comment was made on a hunch. Adding ever-changing context in responses "to be right" is just bad manners. | | |
| ▲ | hparadiz 5 hours ago | parent [-] | | 50k is what I consider bare minimum but I would like to have 100k. Honestly I'd like to have as much as I can get. Context window is what makes it useful. I wanna feed it all the information at the same time. If I can feed it my entire code base it becomes much more useful than if I feed it only some of my code base. |
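To see why context length, not model size, is the sticking point on 16GB, here's a rough KV-cache estimate. The layer/head dimensions below are illustrative placeholders for a ~9B GQA-style model, not the published Qwen config, and real runtimes add overhead on top.

```python
# Rough KV-cache size estimate: per token, the cache stores K and V vectors
# for every layer, so memory grows linearly with context length.
# per-token bytes = 2 (K and V) * layers * kv_heads * head_dim * bytes/element
def kv_cache_gb(tokens: int, layers: int = 40, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: float = 2.0) -> float:
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return tokens * per_token / 1e9  # decimal GB

print(round(kv_cache_gb(50_000), 1))                        # fp16 cache at 50k tokens
print(round(kv_cache_gb(100_000, bytes_per_elem=1.0), 1))   # 8-bit cache at 100k tokens
```

Under these assumptions an fp16 cache at 50k tokens already costs ~8GB on top of the ~5GB of Q4 weights, which is why 16GB lands in the 40-70k range and why quantizing the cache roughly doubles the reachable context.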
|
|
|