reddit_clone 5 hours ago
>64 GB That's the rub. I have an M4 with 48 GB. I wonder if it is worth testing this out. My past attempts (with Ollama and various LLMs) were too slow to use.
hkchad 5 hours ago
I have an M5 Max with 128 GB, and local models are toys compared to hosted ones. I've spent a lot of time and money trying to make them work even half as well.
dofm 3 hours ago
Some of these models will be a bit of a squeeze at Q4_0, I suspect; almost certainly they will be running on the CPU. The 31B Gemma will probably be too much; the Gemma-4 26B QAT maybe not. But if you just want to play around rather than code, you really might find the Gemma 4 12B model worth mucking about with, just so you've gone through the steps, especially if you want to try image analysis or audio transcription. If you're writing PHP, I think you could even find it good enough; I've been modestly surprised. You can do that basic fiddling with the Edge AI Gallery app, which can enable thinking and has a customisable system prompt and some agent support. You could also try the 14B DeepSeek R1.

Honestly, even if it is not good enough, if you are anything like me, I think you'll find that going through this process is really quite educational. It has made a lot of things more concrete for me in a way that I have found reassuring and valuable.
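For a rough sense of what "a squeeze at Q4_0" means, here is a back-of-the-envelope sketch in Python. It assumes the GGML/llama.cpp Q4_0 layout (blocks of 32 weights stored as 16 bytes of 4-bit values plus one 2-byte fp16 scale, so 4.5 bits per weight) and counts the weights only. The helper name `q4_0_weight_gib` is made up for illustration, and the parameter counts are just the nominal sizes mentioned above. KV cache, context, runtime overhead and whatever the OS needs all come on top, and real model files may keep some tensors at higher precision, so treat the result as a lower bound rather than a prediction.

```python
# Rough weight-only memory estimate for a Q4_0-quantised model.
# Assumption: Q4_0 as in GGML/llama.cpp, i.e. 32 weights per block stored as
# 16 bytes of packed 4-bit values plus one 2-byte fp16 scale.
Q4_0_BITS_PER_WEIGHT = (16 + 2) * 8 / 32  # 4.5 bits per weight


def q4_0_weight_gib(params_billions: float) -> float:
    """Approximate size of the weights alone, in GiB (hypothetical helper)."""
    total_bytes = params_billions * 1e9 * Q4_0_BITS_PER_WEIGHT / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Nominal parameter counts taken from the model names in the comment.
    for name, size_b in [("31B", 31), ("26B", 26), ("14B", 14), ("12B", 12)]:
        print(f"{name}: ~{q4_0_weight_gib(size_b):.1f} GiB of weights, before KV cache etc.")
```

Compare the numbers against the memory you actually have free, not the headline figure for the machine.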
codazoda 4 hours ago
I'm running an M3 Air with just 16 GB. I can still get useful results without an internet connection in "chat mode". It's a different experience than using Claude, for sure, but it's workable. I typically use the Qwen variants these days.
contingencies 4 hours ago
M4 with 24 GB here. You'll be fine. If you're anything like me, minor latency is an acceptable price for (a) privacy, (b) reliability, (c) CI/CD/guardrails, (d) network independence and (e) future-proofing vs. AIaaS. https://omlx.ai/ gives you intelligent model download recommendations based on your local hardware. That said, it probably depends heavily on your workload, process and polish expectations. See also https://news.ycombinator.com/item?id=48089091