dial9-1 12 hours ago

Still waiting for the day I can comfortably run Claude Code with local LLMs on macOS with only 16 GB of RAM.

bearjaws 4 hours ago | parent | next [-]

My super uninformed theory is that local LLMs will trail foundation models by about 2 years for practical use.

For example, a lot of work right now is going into improving tool calling and agentic workflows; tool calling only started appearing in local LLMs around the end of 2023.

This is putting aside the standard benchmarks, which get "benchmaxxed" by local LLMs and show impressive numbers but rarely meet expectations when actually used with OpenCode. In theory Qwen3.5-397B-A17B should be nearly a Sonnet 4.6-level model, but it is not.
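
For anyone unfamiliar with the tool calling being discussed: below is a minimal sketch of the OpenAI-style request shape that most local servers (llama.cpp's llama-server, Ollama) now emulate. The localhost URL, port, model name, and the read_file function are illustrative placeholders, not anything from the thread.

    # Minimal sketch of an OpenAI-style tool-calling request.
    # Assumes a local server exposing the OpenAI-compatible /v1 API;
    # the URL, model name, and tool definition are placeholders.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

    tools = [{
        "type": "function",
        "function": {
            "name": "read_file",  # hypothetical tool, for illustration
            "description": "Read a file from the workspace",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="local-model",  # whatever the server is configured to serve
        messages=[{"role": "user", "content": "What does main.py do?"}],
        tools=tools,
    )

    # A model with solid tool support replies with a structured call
    # rather than prose; weaker local models often fumble exactly this.
    print(resp.choices[0].message.tool_calls)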

rubymamis 6 hours ago | parent | prev | next [-]

Doesn't OpenCode support local models?

g947o 4 hours ago | parent [-]

You can, but the quality sucks.

Local LLMs don't make sense for most people compared to "cloud" services, even more so for coding.
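
If you want to judge that quality for yourself, a quick smoke test is to hit a local OpenAI-compatible endpoint with a coding prompt before wiring it into a coding agent. A sketch, assuming llama.cpp's llama-server (or Ollama) is already running locally; the port and model name are placeholders.

    # Smoke-test a local OpenAI-compatible endpoint with a coding prompt.
    # Assumes a server is already running; port and model are placeholders.
    import requests

    payload = {
        "model": "local-model",
        "messages": [
            {"role": "user",
             "content": "Write a Python function that reverses a linked list."},
        ],
        "temperature": 0.2,
    }

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json=payload,
        timeout=120,
    )
    resp.raise_for_status()

    # Eyeball the output: real coding tasks are where local models tend
    # to fall short of hosted ones, per the comments above.
    print(resp.json()["choices"][0]["message"]["content"])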

gedy 12 hours ago | parent | prev | next [-]

How close is this? It says it needs 32GB min?

HDBaseT 12 hours ago | parent | next [-]

Sure, you can run Qwen3.5-35B-A3B on 32GB of RAM, but 'Claude Code' performance (by which I assume he means Sonnet- or Opus-level models in 2026) is likely still a few years away from being runnable locally on reasonable hardware.
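
A rough back-of-envelope for why 32GB is about the floor for that model. The parameter count comes from the model name above; the quantization, KV-cache, and OS overhead figures are assumptions, not measurements.

    # Does a ~35B-parameter model fit in 32 GB at Q4? Rough estimate only.
    PARAMS = 35e9          # total parameters; A3B means ~3B active per
                           # token, but all expert weights stay resident
    BITS_PER_WEIGHT = 4.5  # Q4 quants average a bit over 4 bits/weight

    weights_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9  # ~19.7 GB
    kv_cache_gb = 2.0      # assumed; grows with context length
    os_and_apps_gb = 6.0   # assumed; macOS plus everything else open

    total_gb = weights_gb + kv_cache_gb + os_and_apps_gb
    print(f"weights ~{weights_gb:.1f} GB, total ~{total_gb:.1f} GB")
    # ~27.7 GB: fits in 32 GB unified memory with little headroom, and
    # is hopeless on the 16 GB machine from the top of the thread.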

Foobar8568 11 hours ago | parent [-]

I fully agree. I run that one at Q4 on my MBP, and the performance (including response quality) is a letdown.

I wonder how people can rave so much about local "small device" LLMs compared to what Codex or Claude Code are capable of.

Sadly there is too much hype around local LLMs; they look great for a 5-minute test and that's it.

brcmthrowaway 11 hours ago | parent [-]

Just train it better with AGENTS.md

Hamuko 6 hours ago | parent | prev [-]

I'm reading "more than 32GB of unified memory" to mean you need at least a 36 GB Mac.
