| ▲ | steve1977 17 hours ago |
Unified memory on Apple Silicon. On PC architecture, you have to shuffle data back and forth between normal RAM and the GPU's VRAM. The Mac mini just happens to be the cheapest way to get this.
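To make the shuffling cost concrete, here's a back-of-envelope sketch in Python; the model size and PCIe bandwidth are illustrative assumptions, not measurements:

    # Rough cost of copying model weights from system RAM into a discrete
    # GPU's VRAM over PCIe, vs. zero-copy access with unified memory.
    # All numbers are illustrative assumptions.
    model_size_gb = 13.0        # assumed ~13 GB quantized model
    pcie_bandwidth_gbs = 32.0   # assumed PCIe 4.0 x16, ~32 GB/s effective

    copy_seconds = model_size_gb / pcie_bandwidth_gbs
    print(f"one RAM -> VRAM copy: ~{copy_seconds:.2f} s")  # ~0.41 s

    # On Apple Silicon the CPU and GPU address the same physical memory
    # pool, so there is no equivalent transfer step; the trade-off is that
    # the model has to fit in (and share) that single pool.

The bigger practical win is usually capacity rather than copy time: a unified-memory machine with 64GB or more can hold a model that simply won't fit in a consumer GPU's VRAM at all.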
| ▲ | phil21 3 hours ago | parent | next [-]
Local LLMs are so utterly slow at the giant context windows openclaw generally works with, even with multiple $3,000+ modern GPUs, that I doubt anyone is actually running it that way. From my basic messing around, local LLM is a toy. I really wanted to make it work and was willing to invest five figures if my basic testing showed promise, but it's utterly useless for the things I eventually want to bring to "prod" with such a setup - largely live devops/sysadmin-style tasking - and I don't want to mess around hyper-optimizing the LLM efficiency itself.

I'm still learning, so perhaps I'm totally off base - happy to be corrected - but even a 50x performance increase at 50% of the capability would be a non-starter due to the speed of the iteration loops. With openclaw burning 20-50M tokens a day with codex just during the "playing around in my lab" stage, I can't see any local LLM short of multiple H200s or something being useful, even as I get more efficient with managing my context.
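For a rough sense of why the iteration loop kills it, here's a sketch of the daily wall-clock cost of that token volume; all the throughput figures below are assumptions about typical consumer vs. hosted setups, not numbers I've measured:

    # Back-of-envelope: wall-clock time to push 20-50M tokens/day through a
    # local model vs. a hosted API. All rates below are assumed/illustrative.
    daily_tokens = 30_000_000        # somewhere in the 20-50M/day range
    prompt_fraction = 0.9            # assumption: most agent tokens are prompt/prefill

    local_prefill_tps = 500.0        # assumed local prompt-processing rate (tokens/s)
    local_decode_tps = 25.0          # assumed local generation rate (tokens/s)
    cloud_prefill_tps = 10_000.0     # assumed hosted prefill throughput
    cloud_decode_tps = 80.0          # assumed hosted generation rate

    def hours(tokens, prefill_tps, decode_tps, prompt_frac):
        prefill = tokens * prompt_frac / prefill_tps
        decode = tokens * (1 - prompt_frac) / decode_tps
        return (prefill + decode) / 3600

    print(f"local: ~{hours(daily_tokens, local_prefill_tps, local_decode_tps, prompt_fraction):.0f} h/day")
    print(f"cloud: ~{hours(daily_tokens, cloud_prefill_tps, cloud_decode_tps, prompt_fraction):.0f} h/day")

With these assumed rates the local setup needs roughly twice as many compute-hours as there are hours in a day, before you even get to model quality or the parallelism a hosted API gives you for free.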
| ▲ | cromka 16 hours ago | parent | prev | next [-]
But the only cheap option is the 16GB base-tier Mac Mini, and that's not a lot of shared memory. Prices increase very quickly for the higher-memory models.
| ▲ | yberreby 5 hours ago | parent | prev [-]
Sure, but aren't most people running the *Claw projects using cloud inference?