antisthenes 3 hours ago

What's the cheapest PC you can buy today that will comfortably run Gemma 4 and everything else you want it to run at the same time?

And how many tokens would that buy?

ls612 3 hours ago | parent [-]

I run it on my 4-year-old MBP and get 10 tok/s. With the RAM shortage, buying anything new today is a nightmare, but anyone with a reasonably modern Mac could probably run it at q6. It is mostly a toy, as 4o models weren’t really suitable for real work IMO, but at least it won’t ever give me a refusal.

jazzyjackson 2 hours ago | parent [-]

At 10 tok/s, are you using it interactively, or do you submit a prompt and come back to it later? I always thought it would make sense to just do conversations over email, asynchronously: the model can take all the time it needs and get back to me when it has an answer.

ls612 an hour ago | parent [-]

10 tok/s is right around the borderline of being good enough for interactive use. I did the math and it is mostly bottlenecked by memory bandwidth, so in the future I can expect to run a similarly sized model on my 4090 once it gets retired from gaming service and get ~25 tok/s, which will be very usable.
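
For anyone who wants to redo that kind of math, here is a rough sketch. It assumes decoding is purely memory-bandwidth-bound, i.e. every generated token streams all of the model's weights from memory once. The function name, the `efficiency` fudge factor, and the example numbers are hypothetical placeholders, not measurements from the thread; plug in your own hardware's bandwidth and your model's on-disk size.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             weights_gb: float,
                             efficiency: float = 1.0) -> float:
    """Estimate decode speed for a memory-bandwidth-bound model.

    Assumption: each generated token reads all weights once, so
    tok/s is roughly (usable bandwidth) / (bytes of weights).
    `efficiency` is a hypothetical factor in (0, 1] for the share
    of peak bandwidth actually achieved in practice.
    """
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return efficiency * bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Illustrative inputs only: a 20 GB quantized model on two
    # made-up memory systems, at an assumed 50% of peak bandwidth.
    for label, bandwidth in [("slower memory", 400.0), ("faster memory", 1000.0)]:
        rate = decode_tokens_per_second(bandwidth, weights_gb=20.0, efficiency=0.5)
        print(f"{label}: ~{rate:.1f} tok/s")
```

This ignores prompt processing, KV-cache reads, and whether the model fits in the faster memory at all, so treat the result as a ceiling rather than a prediction.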