sipjca 2 hours ago
fwiw, because relatively few params are activated per token, offloading to system RAM is quite feasible; you can see an endless number of people doing this on r/localllama with qwen3.6 35a3b
bitwize 7 minutes ago
I ran Gemma4 26B A4B on an 8yo PC with a fucking GTX and it did rather well.
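For anyone wondering why the offloading point up-thread holds, here is a rough back-of-envelope sketch. It assumes "35a3b" means about 35B total and 3B activated params, and the 4-bit quantization and 50 GB/s RAM bandwidth are made-up illustrative numbers, not measurements. It also ignores KV cache, compute time, and whatever stays on the GPU, so treat the result as a ceiling, not a prediction.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def decode_tps_ceiling(active_params_billion: float,
                       bits_per_weight: float,
                       bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if each active weight is read once per token."""
    return bandwidth_gb_s / weight_gb(active_params_billion, bits_per_weight)


# All values below are assumptions for illustration only.
TOTAL_B, ACTIVE_B = 35.0, 3.0  # assumed reading of "35a3b"
BITS = 4.0                     # assumed 4-bit quantization
RAM_BW = 50.0                  # assumed system RAM bandwidth in GB/s

print(f"weights to hold:   {weight_gb(TOTAL_B, BITS):.1f} GB")
print(f"read per token:    {weight_gb(ACTIVE_B, BITS):.1f} GB")
print(f"MoE ceiling:       {decode_tps_ceiling(ACTIVE_B, BITS, RAM_BW):.1f} tok/s")
print(f"dense 35B ceiling: {decode_tps_ceiling(TOTAL_B, BITS, RAM_BW):.1f} tok/s")
```

Under those assumptions the whole model still has to fit somewhere (RAM plus VRAM), but only the activated slice is read per token, which is why the bandwidth-bound ceiling comes out roughly ten times higher than for a dense model of the same total size.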