data-ottawa 18 hours ago
I have the Framework Desktop with the 395+ and 128GB of RAM.

Today I am pretty happy with it. LLMs are finally good enough (fast enough with MTP+MoE, but also just much better in capability) that I can fit local ones into real tasks, and I've used image generation with invokeAI to do some genuinely useful things like rendering concepts for a renovation.

I mostly use lemonade-server and invokeAI for my workloads. Previously I used llama-swap, but lemonade is just an easier system to manage. ROCm is finally usable.

Up until the end of Q1 2026 it felt like a total waste of money, largely due to AMD. ROCm was unusable all of last year; there was an entire month where PyTorch crashed just trying to multiply two matrices due to AMD Linux driver issues. kyuz0's toolboxes were really the only way to do anything on the machine. Thankfully things are in a good state now, finally.

I probably only need ~64GB of RAM. There aren't a ton of high-parameter-count MoE models with a small enough active set that they feel nice to use. But it is nice that I can have many models or different modalities in memory at the same time, which is what the LMX Omni "models" do.

The numbers in the article for gptOSS feel a little irrelevant now. Prompt processing is definitely an issue, and diffusion is very, very slow. PP speed hits you hard if you run an agent and try to compact context. Realistically most files are not large enough that it's a huge deal, but it does make large-scale agentic work slow.
Neywiny 14 hours ago
With 64GB it looks like I can't run 120b models, which is a problem for me. Maybe 96GB would be enough.
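For anyone wanting to sanity-check the 64 vs 96 question, here is a rough back-of-the-envelope sketch. It is not a measurement: the 4.25 bits per weight and the 8 GB allowance for KV cache, runtime, and OS are assumptions picked for illustration, and the helper names are made up. Real usage depends on the quantization, context length, and how much memory the system lets the GPU address.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in decimal GB.

    params_billions * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB,
    which simplifies to params_billions * bits_per_weight / 8.
    """
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         ram_gb: float, overhead_gb: float) -> bool:
    """True if weights plus an assumed fixed overhead fit in ram_gb."""
    return weights_gb(params_billions, bits_per_weight) + overhead_gb <= ram_gb


# Assumed values: ~4.25 bits/weight (a 4-bit style quantization with scales)
# and 8 GB of overhead for KV cache, runtime, and the OS. Both are guesses.
for ram in (64, 96, 128):
    print(f"{ram} GB: weights ~{weights_gb(120, 4.25):.2f} GB, "
          f"fits={fits(120, 4.25, ram, overhead_gb=8)}")
```

Under those assumptions the weights alone are already close to 64 GB, so 64 is out once any overhead is added, while 96 leaves some room for context.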