blooalien 7 days ago

Yeah, if I remember correctly, Ollama loads models in "layers" and is capable of putting some layers in GPU RAM and the rest in regular system RAM.
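
For anyone curious what that kind of split looks like in principle, here is a rough Python sketch. It is not Ollama's actual code: it assumes every layer takes the same amount of memory and that the GPU has a fixed free-memory budget, and `split_layers` and its parameters are hypothetical names made up for illustration.

```python
def split_layers(num_layers: int, layer_size_mib: int, vram_budget_mib: int) -> tuple[int, int]:
    """Decide how many layers go to GPU memory and how many stay in system RAM.

    Assumption (not taken from Ollama): all layers are the same size, and
    layers are placed on the GPU until the budget is used up.
    """
    if num_layers < 0 or layer_size_mib <= 0:
        raise ValueError("num_layers must be >= 0 and layer_size_mib must be > 0")

    # How many whole layers fit in the available GPU memory.
    fits_on_gpu = max(0, vram_budget_mib) // layer_size_mib

    # Never offload more layers than the model actually has.
    gpu_layers = min(num_layers, fits_on_gpu)
    cpu_layers = num_layers - gpu_layers
    return gpu_layers, cpu_layers


# Example: a 32-layer model, 200 MiB per layer, 4000 MiB of free GPU memory.
gpu, cpu = split_layers(32, 200, 4000)
print(f"{gpu} layers on GPU, {cpu} layers in system RAM")  # 20 on GPU, 12 in RAM
```

A real runtime would also have to account for things like the context cache and per-layer size differences, so treat the numbers as illustrative only.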