Things I Think I Think... Preferring Local OSS LLMs(blogs.newardassociates.com)
19 points by zdw 10 hours ago | 3 comments
jeromechoo 43 minutes ago | parent | next [-]

I think many developers worth their salt will argue the same. Cloud is, and always has been, a shortcut to buying your own hardware. Local models will get better and smaller. Qwen3-coder-next runs on a Spark and is as capable as Sonnet 4.5. Bonsai released a 1-bit model yesterday.

I also like the freedom of not having to ration a daily allowance of tokens.

farfatched 3 hours ago | parent | prev | next [-]

I'd like a local LLM too, but they're expensive (consider the opportunity cost of a GPU, if it sits idle most of the time), and produce heat and noise in places that I'm trying to cool and quiet.

I'd like a private jet too, alas.

androiddrew 3 hours ago | parent | prev [-]

I love local-first. I'm finding that a 120B MoE hits the sweet spot for local hosting. Right now that takes a ~$2K Strix Halo, a ~$4K GB10 machine, or a ~$5K Mac Pro. Two years from now I think hardware will bring that back down to the ~$2K range with good performance.

I love my dual-GPU setup (2× AMD Radeon R9700, 64GB VRAM total), but it uses about 5× the electricity of my GX10 (GB10 chip inside), and since some layers land in system memory, my tokens/sec is half that of the GX10.
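The "layers landing in system memory" point can be sketched with some rough arithmetic. This is an illustrative estimate only: the layer count, weight size, and headroom figures below are assumptions, not measurements of any of the machines mentioned.

```python
# Rough sketch (assumed numbers): why spilled layers halve throughput.
# A ~120B-parameter model at 4-bit quantization needs roughly 60 GB of
# weights; any layers that don't fit in VRAM are served from system RAM
# at much lower bandwidth, dragging down tokens/sec.

def split_layers(total_layers: int, model_gb: float, vram_gb: float) -> tuple[int, int]:
    """Return (layers that fit on GPU, layers spilled to system RAM)."""
    gb_per_layer = model_gb / total_layers
    on_gpu = min(total_layers, int(vram_gb // gb_per_layer))
    return on_gpu, total_layers - on_gpu

# Hypothetical figures: 36 layers, ~60 GB of 4-bit weights, 64 GB of VRAM
# across two cards, minus ~8 GB of headroom for KV cache and activations.
on_gpu, spilled = split_layers(36, 60.0, 64.0 - 8.0)
print(on_gpu, spilled)
```

Even a few spilled layers hurt, because every decoded token has to traverse them at system-RAM bandwidth.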

Now, a dense model like Devstral 2 24B slaps on the dual-GPU setup. I just haven't gotten as much out of it as I have the 120B MoEs.
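The dense-vs-MoE trade-off above comes down to how many weights each token actually touches. A back-of-envelope sketch, with assumed numbers (the ~5B active-parameter count, 4-bit weights, and bandwidth figure are all hypothetical, not specs of the hardware in this thread):

```python
# Back-of-envelope: decode speed is roughly bounded by
# memory_bandwidth / bytes_of_weights_read_per_token.
# An MoE only reads its active experts per token; a dense model reads everything.

def est_tps(active_params_b: float, bytes_per_param: float, bandwidth_gbs: float) -> float:
    """Crude tokens/sec upper bound from memory bandwidth alone."""
    return bandwidth_gbs / (active_params_b * bytes_per_param)

# Hypothetical: a 120B MoE with ~5B active params vs a 24B dense model,
# both 4-bit (0.5 bytes/param), on ~640 GB/s of memory bandwidth.
moe = est_tps(5.0, 0.5, 640.0)     # only active experts are read per token
dense = est_tps(24.0, 0.5, 640.0)  # every weight is read per token
print(round(moe), round(dense))
```

Under these assumptions the sparse 120B MoE can actually decode faster per token than the 24B dense model, which is why the MoEs punch above their weight on memory-bandwidth-limited local boxes.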