Remix clone Hacker News

▲

jklmnopqrstuvw 9 hours ago

In my experience, GLM-5.2 generally uses more tokens and is much slower.

▲

pimeys 9 hours ago | parent | next [-]

I use GLM 5.2 Fast from Fireworks and it's very fast. Where are you using it from?

▲

microtonal 9 hours ago | parent | prev | next [-]

Which inference provider do you use? (Admittedly, I use K2.7 a lot more currently.)

▲

james2doyle 9 hours ago | parent | prev [-]

Tokens and speed are a factor, but does it require less back-and-forth to get things right? Being "fast and cheap but wrong" still has a cost that an otherwise "expensive and slow" exchange does not.

▲

paradox460 3 hours ago | parent [-]

In my experience it spends a lot more tokens to do things. I wrote a tiny extension for omp that counts the number of "Actually"s in the response and, if the count exceeds a threshold, stops execution and waits for me to tell it what to do. Even then it frequently just ignores basic instructions like "only write boilerplate, I will fill in the functionality".

IMO MiniMax and MiMo are a lot more reliable (and cheap). Not Opus level, but close enough and cheap enough to get the job done.
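
For anyone curious what the "Actually" counter might look like, here is a rough sketch of just the counting and threshold check. It is not the actual extension: the function names, the default threshold, and the case-insensitive whole-word matching are all assumptions, and the omp hook that would call this and halt execution is not shown because the thread doesn't describe it.

```python
import re

# Hypothetical helper names; the real omp extension API is not shown in the thread.
# Assumption: match "actually" as a whole word, ignoring case.
ACTUALLY = re.compile(r"\bactually\b", re.IGNORECASE)


def count_actually(text: str) -> int:
    """Count whole-word occurrences of "actually" in a model response."""
    return len(ACTUALLY.findall(text))


def should_pause(text: str, threshold: int = 3) -> bool:
    """Return True when the count exceeds the (assumed) threshold."""
    return count_actually(text) > threshold
```

A caller would run `should_pause(response_text)` on each response and, when it returns True, stop and hand control back to the user.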