Tepix 3 hours ago

If you want decent performance (more than, say, 20 tokens/s) for your dev team, you absolutely do need all of the model in VRAM.
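
A rough back-of-the-envelope sketch of why, assuming single-stream decoding of a dense model is limited purely by memory bandwidth (each generated token reads every weight once). The function name and all the numbers (40 GB of weights, 1000 GB/s VRAM, 50 GB/s system RAM) are illustrative assumptions, not measurements, and the estimate ignores KV cache traffic, compute, batching, and PCIe transfers:

```python
def decode_tokens_per_s(model_gb, vram_fraction, vram_gbps, ram_gbps):
    """Rough upper bound on single-stream decode speed in tokens/s.

    Assumption: every token reads all weights once, and memory
    bandwidth is the only bottleneck (hypothetical simplification).
    """
    # Time per token spent reading the part of the model held in VRAM.
    time_vram = model_gb * vram_fraction / vram_gbps
    # Time per token spent reading the part left in system RAM.
    time_ram = model_gb * (1.0 - vram_fraction) / ram_gbps
    return 1.0 / (time_vram + time_ram)


# Illustrative numbers only: 40 GB of weights, 1000 GB/s VRAM, 50 GB/s RAM.
all_in_vram = decode_tokens_per_s(40, 1.0, 1000, 50)
mostly_in_vram = decode_tokens_per_s(40, 0.8, 1000, 50)
print(f"100% in VRAM: {all_in_vram:.1f} tokens/s")
print(f" 80% in VRAM: {mostly_in_vram:.1f} tokens/s")
```

Under these assumptions the fully resident case comes out at roughly 25 tokens/s, while leaving just 20% of the weights in system RAM drops it to roughly 5 tokens/s, because the slow portion dominates the time per token.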