Cakez0r 3 hours ago

202.7 tok/s is only OK speed? Which providers are you using that are significantly better than that?

mythz 39 minutes ago | parent | next [-]

I said speed was great. Cerebras and Groq can provide better performance, as can the Fast versions of Cursor's Composer and Claude.

The reported speed, like benchmarks, is only a number on paper. We'll see how it holds up in real-world usage; so far OpenRouter is only reporting 73 tps [1].

[1] https://openrouter.ai/x-ai/grok-4.3

lukewarm707 4 minutes ago | parent [-]

i really don't trust openrouter numbers.

i use byok and see responses fail on openrouter while they work perfectly at the provider. the provider is often listed as 'down' and it's very clearly up on the original api and serving requests.

cerebras quotes oss 120b at 3000tps and it is under 800 on openrouter.

same with fireworks, i am getting much higher numbers outside openrouter. but recently i think fireworks deepseek is kind of spotty. the main provider i know that just doesn't go down is vertex, and they charge 2-3x the rest

mritchie712 2 hours ago | parent | prev [-]

for reference, it's the 2nd fastest model tracked in the "Highlights" section of https://artificialanalysis.ai/

Cakez0r 2 hours ago | parent | next [-]

Yes, it's incredibly fast. OpenRouter is clocking 60 tokens per second, which is on par with the likes of Sonnet, Opus, GPT 5.5.

goldenarm 2 hours ago | parent | prev [-]

That section misses Cerebras and Groq, which are up to 5x faster.

Havoc 2 hours ago | parent [-]

Very different tech and limitations though, so it wouldn’t make sense to compare 1:1, I think

goldenarm an hour ago | parent [-]

What are the limitations?

gslepak 38 minutes ago | parent [-]

Much smaller context