criddell 5 days ago

If you had a $2500ish budget for hardware, what types of models could you run locally? If $2500 isn't really enough, what would it take?

Are there any tutorials you can recommend for somebody interested in getting something running locally?

cogman10 5 days ago | parent | next [-]

This is where you'd start for local: https://ollama.com/
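
A minimal sketch of getting started, assuming the Ollama server is installed and running locally and using its official Python client (the model name below is just an example):

    # Sketch: talk to a locally running Ollama server via its Python client
    # (pip install ollama). Assumes the server is already running.
    import ollama

    # Download the model weights if they aren't present yet.
    ollama.pull("deepseek-r1:7b")

    # One-shot chat request against the local server.
    response = ollama.chat(
        model="deepseek-r1:7b",
        messages=[{"role": "user", "content": "Explain quantization in one paragraph."}],
    )
    print(response["message"]["content"])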

You can roughly convert the parameter count (in billions) to the GB of memory needed. For example, Deepseek-r1:7b needs about 7 GB of memory to run locally.

Context window matters too: the more context you need, the more memory you'll need.
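
For a rough back-of-envelope version of that rule (the bytes-per-parameter figures are my own assumptions for common quantization levels, and the KV cache overhead is extra):

    # Rough estimate of model weight memory, not an exact formula.
    # Bytes per parameter: ~2 for fp16, ~1 for 8-bit quant, ~0.5-0.6 for 4-bit quant.
    def estimate_model_gb(params_billion: float, bytes_per_param: float = 1.0) -> float:
        # 1e9 params * N bytes per param is roughly N GB per billion params
        return params_billion * bytes_per_param

    # A 7B model at 8-bit needs roughly 7 GB, plus KV cache that grows with context.
    print(estimate_model_gb(7, 1.0))   # ~7 GB
    # A 40B model at ~4-5 bits comes in around 24 GB, which is why it fits on a 32 GB machine.
    print(estimate_model_gb(40, 0.6))  # ~24 GB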

If you are looking at AI devices around $2500, you'll probably want something like this [1]. A unified memory architecture (which will mean LPDDR5) gives you the most memory for the least money to play with AI models.

[1] https://frame.work/products/desktop-diy-amd-aimax300/configu...

mark_l_watson 5 days ago | parent | prev | next [-]

I bought a Mac Mini M2 Pro with 32 GB 18 months ago for $1900. It is sufficient to run good local models, quantized, up to and including 40B.

When local models don’t cut it, I like Gemini 2.5 flash/pro and gemini-cli.

There are a lot of good options for commercial APIs and for running local models. I suggest choosing a good local model and a good commercial API, and spending more time building things than constantly re-evaluating all the options.

criddell 5 days ago | parent | next [-]

Are there any particular sources you found helpful to get started?

It's been a while since I checked out Mini prices. Today, $2400 buys an M4 Pro with all the cores, 64GB RAM, and 1TB storage. That's pleasantly surprising...

mark_l_watson 5 days ago | parent [-]

You can read my book on local models with Ollama free online: https://leanpub.com/ollama/read

criddell 5 days ago | parent [-]

Awesome, thanks!

dstryr 5 days ago | parent | prev | next [-]

I would purchase two used 3090s, as close to $600 each as you can get. The 3090 remains the price-performance king.

yieldcrv 5 days ago | parent | prev | next [-]

The local side of things, with a $7,000-$10,000 machine (512 GB of fast memory, CPU, and disk), can almost reach parity for text input/output and 'reasoning', but it lags far behind for anything multimodal: audio input, voice output, image input, image output, document input.

There are no out-of-the-box solutions to run a fleet of models simultaneously or containerized, either.

So the closed-source solutions in the cloud are light years ahead, and it's been this way for 15 months now, with no signs of stopping.

omneity 5 days ago | parent [-]

Would running vLLM in docker work for you, or do you have other requirements?
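
For context, a rough sketch of what that might look like: vLLM ships a Docker image that exposes an OpenAI-compatible HTTP API, so a client can be as simple as the below (the image tag, model name, and port are assumptions, not a recommendation):

    # Server side, for reference (shell):
    #   docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
    #       --model Qwen/Qwen2.5-7B-Instruct
    # Client side: plain HTTP against vLLM's OpenAI-compatible endpoint.
    import requests

    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "Qwen/Qwen2.5-7B-Instruct",
            "messages": [{"role": "user", "content": "Hello from a container"}],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])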

yieldcrv 5 days ago | parent [-]

it's not an image and audio model, so I believe it wouldn't work for me by itself

would probably need multiple models running in distinct containers, with another process coordinating them
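
Something like a thin router in front of per-model containers, roughly; the endpoints and model names here are entirely hypothetical:

    # Hypothetical coordinator: forwards each request to whichever single-purpose
    # model container handles that modality. Endpoints and names are made up.
    import requests

    MODEL_ENDPOINTS = {
        "text": "http://text-model:8000/v1/chat/completions",
        "audio": "http://audio-model:8001/transcribe",
        "image": "http://image-model:8002/describe",
    }

    def route(modality: str, payload: dict) -> dict:
        # Each modality is served by its own container; this process just coordinates.
        url = MODEL_ENDPOINTS[modality]
        return requests.post(url, json=payload, timeout=120).json()

    # e.g. route("text", {"model": "local-llm", "messages": [...]})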

redox99 5 days ago | parent | prev | next [-]

Kimi and DeepSeek are the only models that don't feel like a large downgrade from the typical providers.

skeezyboy 5 days ago | parent | prev [-]

You can run Ollama models with just a decent CPU for some of them.