Remix.run Logo
ac29 7 hours ago

GLM 5.1 is pretty good, probably the best non-US agentic coding model currently available. But both GLM 5.0 and 5.1 have had issues with availability and performance that makes them frustrating to use. Recently GLM 5.1 was also outputting garbage thinking traces for me, but that appears to be fixed now.

cmrdporcupine 7 hours ago | parent [-]

Use them via DeepInfra instead of z.ai. No reliability issues.

https://deepinfra.com/zai-org/GLM-5.1

Looks like fp4 quantization now though? Last week was showing fp8. Hm..

wolttam 7 hours ago | parent [-]

Deepinfra's implementation of it is not correct. Thinking is not preserved, and they're not responding to my submitted issue about it.

I also regularly experience Deepinfra slow to an absolute crawl - I've actually gotten more consistent performance from Z.ai.

I really liked Deepinfra but something doesn't seem right over there at the moment.

cmrdporcupine 6 hours ago | parent [-]

Damn. Yeah, that sucks. I did play with it earlier again and it did seem to slow down.

It's frankly a bummer that there's not seemingly a better serving option for GLM 5.1 than z.AI, who seems to have reliability and cost issues.