cmrdporcupine a day ago

Not when you factor in token efficiency. It burns a lot more tokens to do the same job, so when I compared it to GPT5.5 I was frankly not much ahead on cost, and with weaker thinking.

Maybe makes sense if you have z.AI's (not greatly priced) subscription plan, but it's not competitive against an OpenAI or Anthropic monthly coding subscription plan. I burned through almost $10 worth of tokens just doing an hour of work.

Sanzig a day ago | parent [-]

Take a look at Ollama Cloud: https://ollama.com/pricing

You get access to a whole bunch of bleeding edge open models including GLM-5.2, Kimi K2.7, DeepSeek 4 Pro, etc. Inference is run on US/SG/EU cloud providers with zero data retention policies. The $20/mo tier is very generous, in my experience.

jeremyjh a day ago | parent | next [-]

They don’t have a statement about where the GLM-5.2 model is run or about its data retention. They do state that for others, like MiniMax.

Sanzig a day ago | parent [-]

There's a blanket statement at the bottom of the pricing page, which I would hope also applies to GLM-5.2:

> Where are models hosted?

> Ollama hosts models and compute resources primarily in the United States. To serve global demand, we may route to Europe and Singapore for additional capacity.

> Is my prompt or response data trained on?

> Prompt or response data is never logged or trained on.

> Who does Ollama partner with to host models?

> Ollama collaborates with NVIDIA Cloud Providers (NCPs) to host open models.

> When Ollama partners with providers, we require no logging, no training, and zero data retention policies in place.

cmrdporcupine 19 hours ago | parent | prev [-]

Well, I tried the $20/mo tier, used GLM specifically, and did maybe 3-4 hours of work. I'm already through 50% of my monthly tier and blew through my time-limited quota twice. I won't renew for another month.

Which I think only underscores my point that actually the GLM models are not very cost effective.

They essentially cost the same as the SOTA models from OpenAI and Anthropic, while not being quite as smart. I could have gotten about the same amount of work done on the $20 Codex plan. And I had to use my $100 Codex plan to finish the work GLM started before it ran out of quota. And also to fix it since GLM left a bit of a mess.

I like that GLM exists. Other Chinese models are far more cost effective. GLM is expensive, even on a fixed plan.
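
For a rough sense of scale, the figures in this comment (3-4 hours of work consuming 50% of a $20 monthly tier) can be turned into a per-hour number. This is only a back-of-the-envelope sketch: it assumes quota burns linearly with hours worked and ignores the time-limited quota entirely, and the function name is made up for illustration.

```python
def plan_cost_per_hour(plan_price: float, fraction_used: float, hours_worked: float) -> float:
    """Dollars of plan value consumed per hour of work (assumes linear usage)."""
    return plan_price * fraction_used / hours_worked

# Figures from the comment: $20/mo tier, 50% used, after 3-4 hours of work.
low = plan_cost_per_hour(20.0, 0.5, 4.0)   # if it took 4 hours
high = plan_cost_per_hour(20.0, 0.5, 3.0)  # if it took 3 hours

# Implied total hours before the monthly tier runs out.
hours_low = 3.0 / 0.5
hours_high = 4.0 / 0.5

print(f"${low:.2f} to ${high:.2f} of plan value per hour")
print(f"about {hours_low:.0f} to {hours_high:.0f} hours per month on the $20 tier")
```

Under those assumptions, the tier covers somewhere around 6 to 8 hours of this kind of work per month, compared with the "almost $10" per hour quoted earlier in the thread for pay-per-token usage.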

jeremyjh 2 hours ago | parent [-]

Ollama can’t meaningfully subsidize their subscriptions - there is no business case to do so because they are a commodity host. If you want to compare subsidized subscription value you would need to compare with z.ai’s plans. One problem with any comparison is that they are all very opaque in terms of usage and the plans change a lot over time. I got on pro at $30 a month so it’s a very good value - compared to $20 Claude/Codex plans I get at least 10x the usage and I use all 3 regularly. At today’s prices Codex pro ($100) is likely a better value.

But if you are building a product, or are in an enterprise environment where you essentially have to pay API rates, then GLM is the best value, hands down.