Remix.run Logo
hkchad 16 hours ago

It depends. I have a M5 128 so i can play around with large models and even keep several of them loaded at once and use something like llama swap to access them all via bifrost or litellm. You won't do this without some serious local GPU's with big memory. The downside is the speed, it's not fast, but fast enough to tinker and develop with worrying about ongoing cloud cost. When done developing and you need to really scale up this is when you can swap to cloud computing and get the job done faster. My $5k macbook can do more than a $50k nvidia/intel/amd setup, just not as fast.

So you need to decide whats important to you if you want to work locally, large/many models or speed. It's the pick 2 problem speed, size, cost, pick 2 or go with cloud and accept your development time is also spent $$ on each iteration.

I was hoping for the M5 ultra by now, but looks like that's not coming until much later this year for a much higher price now.