evilduck 2 hours ago

Works great for these types of MoE models. Having a large amount of VRAM lets you run different models in parallel easily, or use context sizes that are actually useful. Dense models can get sluggish, though. AMD's ROCm support has been a little rough for Stable Diffusion stuff (memory issues leading to application stability problems), but it has worked well with LLMs, as has Vulkan.

I wish AMD would get around to adding NPU support in Linux for it, though; it has more potential that could be unlocked.