jklmnopqrstuvw 9 hours ago

In my own experience, GLM-5.2 generally uses more tokens and is much slower.

pimeys 9 hours ago | parent | next [-]

I use GLM 5.2 Fast from Fireworks and it's very fast. Where are you using it from?

microtonal 9 hours ago | parent | prev | next [-]

Which inference provider do you use? (Admittedly, I use K2.7 a lot more currently.)

james2doyle 9 hours ago | parent | prev [-]

Tokens and speed are a factor, but does it require less back-and-forth to get things right? Being "fast and cheap but wrong" still has a cost that an otherwise "expensive and slow" exchange does not.

paradox460 3 hours ago | parent [-]

In my experience it spends a lot more tokens to do things. I wrote a tiny extension for omp that counts the occurrences of "Actually" in the response and, if the count exceeds a threshold, stops execution and waits for me to tell it what to do. Even then it frequently just ignores basic instructions like "only write boilerplate, I will fill in the functionality".

Imo MiniMax and MiMo are a lot more reliable (and cheap).

Not Opus level, but close enough and cheap enough to get the job done.
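
For anyone curious what an "Actually" counter like the one described above might look like, here is a rough sketch. It is not the actual omp extension: the function names, the threshold value, and the case-insensitive whole-word matching are all assumptions made for illustration, and hooking it into an agent's response loop is left out.

```python
import re

# Hypothetical threshold; the comment above does not say what value was used.
ACTUALLY_THRESHOLD = 3

# Whole-word, case-insensitive match (an assumption), so "factually" is not counted.
_ACTUALLY = re.compile(r"\bactually\b", re.IGNORECASE)


def count_actually(text: str) -> int:
    """Return how many times the word "actually" appears in text."""
    return len(_ACTUALLY.findall(text))


def should_pause(text: str, threshold: int = ACTUALLY_THRESHOLD) -> bool:
    """Return True when the count exceeds the threshold, i.e. the caller
    should stop the run and wait for the user to say what to do next."""
    return count_actually(text) > threshold
```

A caller would run `should_pause(response_text)` on each model response and halt when it returns True. Counting self-corrections is only a crude proxy for a model going in circles, so the threshold would likely need tuning per model.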