pimeys 6 hours ago

They are, but in our evals, for example, GLM 5.2 (unquantized) performs as well as Opus while using more tokens and taking more time.

I really wish this would change soon but they are not there yet.

klardotsh 4 hours ago | parent | next [-]

Using even double the total tokens and taking, what, 2-3x the time still seems worth it if prices are 5x+ cheaper (which OpenRouter [1] claims is the case).

On NeuralWatt for my personal projects at home (not affiliated, just a happy customer), I get so much more mileage out of GLM than I get out of Claude at work, specifically because it's priced as a hammer I can pound any nail-shaped-object with, not a delicacy I need to carefully budget-analyze to try to figure out if it's worth burning my monthly spend limits on this task.

[1] https://openrouter.ai/compare/z-ai/glm-5.2/anthropic/claude-...
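A rough sanity check of that trade-off, as a minimal sketch: the 2x token and 5x price figures are the estimates quoted in this thread, not measurements, and `relative_cost` is a hypothetical helper that ignores the extra wall-clock time.

```python
def relative_cost(token_multiplier: float, price_divisor: float) -> float:
    """Spend on the cheaper model relative to the baseline (1.0 = same spend).

    token_multiplier: how many times more tokens the cheaper model uses.
    price_divisor: how many times cheaper its per-token price is.
    """
    if price_divisor <= 0:
        raise ValueError("price_divisor must be positive")
    return token_multiplier / price_divisor


# Thread estimates (assumptions): double the tokens, 5x cheaper per token.
ratio = relative_cost(token_multiplier=2.0, price_divisor=5.0)
print(f"relative spend: {ratio:.2f}x of baseline")
```

Under those assumed figures the cheaper model still costs less than half as much per task; the break-even point is where the token multiplier equals the price divisor.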

Den_VR 4 hours ago | parent | prev [-]

I thought true token use was being hidden by both Anthropic and OpenAI.

vanviegen 4 hours ago | parent [-]

No, they do specify token counts, as they let you pay for them. They just don't tell you what these thinking tokens actually are.

girvo 3 hours ago | parent [-]

Though because they don't show you, they could be lying about it. Very unlikely, I think; it would be too dangerous IMO. But technically possible.