DrBenCarson | 7 hours ago
How are you using that RAM with the GPU?

canpan | 7 hours ago
Llama.cpp with automatic offload to main memory. You can also use Ollama, which is easier but slower.
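A minimal sketch of what that offload looks like with llama.cpp's CLI (the model path and layer count here are placeholders, not from the thread): `--n-gpu-layers` sets how many transformer layers are placed in VRAM, and the remaining layers run from system RAM on the CPU.

```shell
# Hypothetical invocation: put 20 layers on an 8 GB GPU;
# the rest of the model stays in main memory.
./llama-cli -m ./models/model.Q4_K_M.gguf \
    --n-gpu-layers 20 \
    -p "Hello"
```

Raising `--n-gpu-layers` until VRAM is nearly full is the usual way to trade memory for speed.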
reverius42 | 2 hours ago
For those who want a GUI, LM Studio does this too (with llama.cpp as the backend, I think). I'm getting great (albeit slow) results with Qwen3.6-35B MoE on 8 GB of GPU RAM and 40 GB of system RAM.