Just to clarify for people focusing on the $180/month price tag:

OpenClaw is not a CC-only product. You can configure it to use any API endpoint.

Paying $180/month to Anthropic is a personal choice, not a requirement to use OpenClaw.

So that leads to a question: Is there a physical box I could buy and amortize over 5-7 years that would come to half the API cost?

In other words, assuming no price increase, 7 years of that pricing is about $15k. Is there hardware I could buy for $7k or less that would be able to replace those API calls or alternative subs entirely?

(I've personally been trying to determine if I should buy a new graphics card for my aging desktop(s), since their current graphics cards can't really handle LLMs.)
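A rough sketch of the arithmetic behind that question, assuming the $180/month price stays flat and ignoring electricity, resale value, and the time value of money. The function names are made up for illustration:

```python
def subscription_cost(monthly_usd: float, years: int) -> float:
    """Total subscription spend over the period, assuming a flat price."""
    return monthly_usd * 12 * years


def hardware_budget(monthly_usd: float, years: int, fraction: float = 0.5) -> float:
    """Hardware spend that equals `fraction` of the subscription total."""
    return subscription_cost(monthly_usd, years) * fraction


for years in (5, 7):
    total = subscription_cost(180, years)
    budget = hardware_budget(180, years)
    print(f"{years} years: subscription ${total:,.0f}, half-cost budget ${budget:,.0f}")
```

Over 7 years that is $15,120 in subscription fees, so the "half the cost" target is roughly $7.5k of hardware, before power.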

ekidd 4 hours ago

You can't realistically replace a frontier coding model on any local hardware that costs less than a nice house, and even then it's not going to be quite as good.

But if you don't need frontier coding abilities, there are several nice models that you can run on a video card with 24GB to 32GB of VRAM. (So a 5090 or a used 3090.) Try Gemma4 and Qwen3.5 with 4-bit quantization from Unsloth, and look at models in the 20B to 35B range. You can try before you buy if you drop $20 on OpenRouter. I have a setup like this that I built for $2500 last year, before things got expensive, and it's a nice little "home lab."
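To see why 20B to 35B models at 4-bit fit on cards like these, here is a back-of-the-envelope sketch. It counts the weights only. The KV cache, activations, and the slightly higher effective bits-per-weight of real quantization formats all add to this, so treat the numbers as a lower bound. The function name is hypothetical:

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for size in (20, 27, 35):
    q4 = weight_memory_gb(size, 4)
    fp16 = weight_memory_gb(size, 16)
    print(f"{size}B params: ~{q4:.1f} GB at 4-bit, ~{fp16:.1f} GB at 16-bit")
```

A 35B model needs about 17.5 GB for 4-bit weights, which leaves some headroom for context on a 24GB card, while the same model at 16-bit would need about 70 GB.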

If you want to go bigger than this, you're looking at an RTX 6000 card, or a Mac Studio with 128GB to 512GB of RAM. These are outside your budget. Or you could look at a Mac Mini, DGX Spark, or Strix Halo. These let you run bigger models, but mostly much more slowly.

TheDong 4 hours ago

You can buy a roughly $40k GPU (the H100), which will cost $100/mo in electricity on top of that, to get about 30-80% of the performance of OpenAI or Anthropic frontier models, depending on what you're doing.

Over 5 years, that works out to ~$45k vs ~$10k. During that time, it's possible that better open models will become available, making the GPU more useful, but it's far more likely that the VC-fueled companies will advance faster (since that's been the trend so far).
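The comparison can be reproduced with a short sketch. The figures are the ones quoted in this thread ($40k GPU, $100/month electricity, $180/month subscription). Ignoring resale value and the cost of the rest of the machine is an assumption of this sketch, and the function names are hypothetical:

```python
def local_cost(hardware_usd: float, electricity_per_month: float, years: int) -> float:
    """Up-front hardware plus electricity over the period (no resale value)."""
    return hardware_usd + electricity_per_month * 12 * years


def api_cost(subscription_per_month: float, years: int) -> float:
    """Flat-rate subscription over the same period."""
    return subscription_per_month * 12 * years


years = 5
local = local_cost(40_000, 100, years)
api = api_cost(180, years)
print(f"local: ${local:,.0f}  api: ${api:,.0f}  ratio: {local / api:.1f}x")
```

With these inputs the local option comes to $46,000 against $10,800 for the subscription, in line with the rounded figures above.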

In other words, the local economics do not work out well at a personal scale at all unless you're _really_ maxing out the GPU at close to 50% literally 24/7, and you're okay accepting worse results.

As long as proprietary models advance as quickly as they are, I think it makes no sense to try to run them locally. You could buy an H100, and then a new model that's too large to run on it could become the state of the art. At that point the resale value plummets, and the card is useless compared to using the new model via APIs or buying a new $90k GPU with twice the memory or whatever.

vrganj 4 hours ago

This feels like it should be state infrastructure, the way roads, railroads and the postal system are.

dsr_ 3 hours ago

This feels like a market which hasn't settled into long-term profitability and is being subsidized by investors.

TheDong 4 hours ago

Note that the (edit: US) postal system is a for-profit system.

Given the trends of the capitalist US government, which constantly cedes more and more power to the private sector, especially Google and Apple, I assume we'll end up with a state-run model infrastructure as soon as we replace the government with Google, at which point Gemini simply becomes state infrastructure.

fineIllregister 3 hours ago

> Note that the (edit: US) postal system is a for-profit system.

That's not correct. If USPS takes in more revenue than its expenses in a year, it can't pay the surplus out as profit to anyone. It's true that USPS is intended to be self-funded, covering its costs through postage and services sold rather than tax revenue. That doesn't mean there's profit anywhere.

vrganj 4 hours ago

> Note that the postal system is a for-profit system.

That depends on the country in question :-)

zozbot234 2 hours ago

For something like OpenClaw you realistically only need rather slow inference, so use SSD offload as described by adrian_b here: https://news.ycombinator.com/item?id=47832249 Though I'm not sure that the support in the main inference frameworks (and even in the GGUF format itself, at least arguably) is up to the task just yet.

wasfgwp 3 hours ago

You can also use models that are several times cheaper than Claude. It's not like you need anything big to handle all the use cases listed above.

swiftcoder 3 hours ago

Yeah, something like MiniMax m2.7 should be perfectly capable for this sort of thing, and is 10-20x cheaper.

rcxdude 4 hours ago

For something the size of Claude, probably not. But for smaller models, maybe (though tokens for those are also much cheaper to buy).