allovertheworld 2 hours ago
How cheap is GLM at Cerebras? I can't imagine why they can't tune the token rate to be lower but drastically reduce the power draw, and thus the cost, for the API.
Zetaphor 2 hours ago
They're running on custom ASICs as far as I understand; it may not be possible to run them effectively at lower clock speeds. That, and/or the market for it doesn't exist in the volume required to be profitable. OpenAI has been aggressively slashing its token costs, not to mention all the free inference offerings you can take advantage of.