gandreani 3 hours ago
Using gpt-5.4-mini in off-peak hours already feels like super-speed to me. That's probably no more than 100-150 tk/s. I can't imagine 750! I've always eyed Cerebras but never had a use for it that would justify paying for the API directly. Although now that I think about it, trying out the API would probably cost less than a subscription for a month...
jasonjmcghee 2 hours ago
Try gpt-5.3-codex-spark. It runs at 1000 TPS and, in my experience, is more capable than 5.4 mini. If you have a subscription, it draws from a different pool of usage.
embedding-shape 3 hours ago
The ChatGPT subscription gives you access to the -spark model(s) in Codex, which are blazing fast (but pretty dumb) and which I think run on Cerebras hardware too.
kegs_ 3 hours ago
I have a pretty good use case for gpt-oss. The time savings have actually been wild. Definitely worth a try. Just to be clear, it gets like 2000 tok/s.